Abstract

Many  statically-typed  programming  languages  provide  an  abstract  data  type 
construct,  such  as  the  package  in  Ada,  the  cluster  in  CLU,  and  the  module 
in  Modula2.  However,  in  most  of  these  languages,  instances  of  abstract  data 
types  are  not  first-class  values.  Thus  they  cannot  be  assigned  to  a  variable, 
passed  as  a  function  parameter,  or  returned  as  a  function  result. 

The higher-order functional language ML has a strong and static type
system with parametric polymorphism. In addition, ML provides type
reconstruction and consequently does not require type declarations for
identifiers. Although the ML module system supports abstract data types,
their instances cannot be used as first-class values for type-theoretic reasons.

In this dissertation, we describe a family of extensions of ML. While
retaining ML’s static type discipline, type reconstruction, and most of its
syntax, we add significant expressive power to the language by incorporating
first-class abstract types as an extension of ML’s free algebraic datatypes. In
particular, we are now able to express

•  multiple  implementations  of  a  given  abstract  type, 

• heterogeneous aggregates of different implementations of the same
abstract type, and

•  dynamic  dispatching  of  operations  with  respect  to  the  implementation 
type. 

Following Mitchell and Plotkin, we formalize abstract types in terms of
existentially quantified types. We prove that our type system is semantically
sound with respect to a standard denotational semantics.

We then present an extension of Haskell, a non-strict functional language
that uses type classes to capture systematic overloading. This language
results from incorporating existentially quantified types into Haskell and
gives us first-class abstract types with type classes as their interfaces. We
can now express heterogeneous structures over type classes. The language
is statically typed and offers flexibility comparable to that of object-oriented
languages. Its semantics is defined through a type-preserving translation to a
modified version of our ML extension.

We have implemented a prototype of an interpreter for our language,
including the type reconstruction algorithm, in Standard ML.


In  memory  of  my  grandfather. 
1 Introduction


Many  statically-typed  programming  languages  provide  an  abstract  data  type 
construct,  such  as  the  package  in  Ada,  the  cluster  in  CLU,  and  the  module 
in Modula2. In these languages, an abstract data type consists of two parts,
interface and implementation. The implementation consists of one or more
representation types and some operations on these types; the interface
specifies the names and types of the operations accessible to the user of the
abstract data type. However, in most of these languages, instances of abstract
data types are not first-class values in the sense that they cannot be assigned
to a variable, passed to a function as a parameter, or returned by a function
as a result. In addition, these languages require that the types of identifiers be
declared explicitly.

1.1  Objectives 

This  dissertation  seeks  to  answer  the  following  question: 

Is it feasible to design a high-level programming language that
satisfies the following criteria:

1. Strong and static typing: If a program is type-correct, no type errors
occur at runtime.

2. Type reconstruction: Programs need not contain any type declarations
for identifiers; rather, the typings are implicit in the program and can
be reconstructed at compile time.

3. Higher-order functional programming: Functions are first-class
values; they may be passed as parameters or returned as results of a
function, and an expression may evaluate to a function.

4. Parametric polymorphism: An expression can have different types
depending on the context in which it is used; the set of allowable contexts
is determined by the unique most general type of the expression.

5. Extensible abstract types with multiple implementations: The
specification of an abstract type is separate from its (one or more)
implementations; code written in terms of the specification of an abstract type
applies to any of its implementations; more implementations may be
added later in the program.

6. First-class abstract types: Instances of abstract types are also
first-class values; they can be combined into heterogeneous aggregates of
different implementations of the same abstract type.

From a language design point of view, criterion 1 is important for
programming safety, criteria 2, 3, 4, and 6 are desirable for conciseness and
flexibility of programming, and criterion 5 is crucial for writing reusable
libraries and extensible systems.

1.2  Approach 

The functional language ML [MTH90] already satisfies criteria 1 through 4
fully, and criteria 5 and 6 in a limited, mutually exclusive way. For this
reason, and because of the extensive previous work on the type theory of ML and
related languages, we choose ML as a starting point for our own work.

In this dissertation, we describe a family of extensions of ML. While
retaining ML’s static type discipline and most of its syntax, we add significant
expressive power to the language by incorporating first-class abstract types
as an extension of ML’s free algebraic datatypes¹. The extensions described
are independent of the evaluation strategy of the underlying language; they
apply equally to strict and non-strict languages. In particular, we are now
able to express

•  multiple  implementations  of  a  given  abstract  type, 

• heterogeneous aggregates of different implementations of the same
abstract type, and

•  dynamic  dispatching  of  operations  with  respect  to  the  implementation 
type. 

Note that a limited form of heterogeneity may already be achieved in
ML by building aggregates over a free algebraic datatype. However, this
approach is not satisfactory because all implementations, corresponding to the
alternatives of the datatype, have to be fixed when the datatype is defined.
Consequently, such a datatype is not extensible and hence useless for the
purpose of, for example, writing a library function that we expect to work
for any future implementation of an abstract type.

ML also features several constructs that provide some form of data
abstraction. The limitations of these constructs are further discussed in
Chapter 2.

1.3  Dissertation  Outline 

The  chapters  in  this  dissertation  are  organized  as  follows: 

• Chapter 2. Preliminaries. In this chapter, we review the preliminary
notions and concepts used in the course of the dissertation. First, we
give an overview of the functional languages ML and Haskell and
discuss the shortcomings of data abstraction in ML. Then, we describe the
untyped and several typed λ-calculi and existentially quantified types
as a formal basis for our type-theoretic considerations. Further, we
discuss standard and order-sorted unification algorithms, which are used
in type reconstruction algorithms. Finally, we give a review of domains
and ideals, which we use as a semantic model for the languages we
discuss.

¹ ML’s version of a variant record in Pascal or Ada.

• Chapter 3. An Extension of ML with First-Class Abstract Types.
This chapter presents a semantic extension of ML, where the
component types of a datatype may be existentially quantified. We show how
datatypes over existential types add significant flexibility to the
language without even changing ML syntax. We then describe a
deterministic Damas-Milner type inference system [DM82] [CDDK86] for our
language, which leads to a syntactically sound and complete type
reconstruction algorithm. Furthermore, the type system is shown to be
semantically sound with respect to a standard denotational semantics.

• Chapter 4. An Extension of ML with a Dotless Dot Notation. In this
chapter, we describe a further extension of our language. The use of
existential types in connection with an elimination construct (open or
abstype) is impractical in certain programming situations; this is
discussed in [Mac86]. A formal treatment of the dot notation, an
alternative used in actual programming languages, is found in [CL90]. This
notation assumes the same representation type each time a value of
existential type is accessed, provided that each access is via the same
identifier. We describe an extension of ML with an analogous notation.
A type reconstruction algorithm is given, and semantic soundness is
shown by translating into the language from Chapter 3.

• Chapter 5. An Extension of Haskell with First-Class Abstract
Types. This chapter introduces an extension of the functional language
Haskell [HPJW+92] with existential types. Existential types combine
well with the systematic overloading polymorphism provided by
Haskell type classes [WB89]; this point is first discussed in [LO91].
Briefly, we extend Haskell’s data declaration in the same way as the
ML datatype declaration above. In Haskell, it is possible to specify
what type class a (universally quantified) type variable belongs to. In
our extension, we can do the same for existentially quantified type
variables. This lets us use type classes as signatures of abstract data
types; we can then construct heterogeneous aggregates over a given
type class.

• Chapter 6. Related Work, Future Work, and Conclusions. This
chapter concludes with a comparison with related work. Most previous
work on existential types does not consider type reconstruction; other
work that includes type reconstruction seems to be semantically
unsound. We apparently are the first to permit polymorphic instantiation
of variables of existential type in the body of the elimination construct.
In our system, such variables are let-bound and therefore
polymorphic, whereas other work treats them monomorphically. We give an
outlook on future work, which includes further extensions with mutable
state and a practical implementation.

The  figure  below  illustrates  the  relationship  between  ML,  Haskell,  the 
languages  introduced  in  this  dissertation,  and  other  possible  extensions. 
2 Preliminaries


In this chapter, we review the preliminary notions and concepts used in the
course of the dissertation. First, we give an overview of the functional
languages ML and Haskell and discuss in detail the shortcomings of data
abstraction in ML. Then, we describe the untyped and several typed λ-calculi
and existentially quantified types as a formal basis for our type-theoretic
work below. Further, we discuss standard and order-sorted unification
algorithms, which are used in type reconstruction algorithms for implicitly typed
languages. Finally, we give a brief review of domains and ideals, which we
use as a semantic model for the languages we discuss.

2.1  The  Languages  ML  and  Haskell 

This  section  gives  an  overview  of  the  functional  languages  ML  and  Haskell 
and  discusses  the  shortcomings  of  the  data  abstraction  constructs  provided 
by  ML.  We  assume  some  general  background  in  programming  languages; 
prior  exposure  to  a  statically  typed  functional  language  is  helpful. 

2.1.1  ML 

We  present  a  few  programming  examples  that  illustrate  the  relevant  core  of 
ML  [MTH90]  and  its  type  system.  For  a  full  introduction,  see  [Har90].  The 
syntax  of  core  expressions  is  defined  recursively  as  constants,  identifiers, 
and  three  constructs: 
Constants      c ::= 0 | 1 | ...
Identifiers    x ::= x | y | ...
Abstractions   f ::= fn x => e
Applications   a ::= e e'
Bindings       b ::= let val x = e in e'
Expressions    e ::= c | x | f | a | b

We  also  assume  that  a  conditional  construct  if  and  a  fixed-point  operator 
fix  are  predefined. 

To  bind  an  identifier,  we  can  just  write 

val  x  =  e 

which  corresponds  to  an  implicit  let  binding  whose  body  encompasses  the 
rest  of  the  program.  For  functions,  we  can  write 

fun  f  x  =  e 

instead  of 

val  rec  f  =  fn  x  =>  e 

If the function is not recursive, that is, f is not called in e, the keyword rec
may be omitted. The simplest polymorphic function is the identity function,
given by

fn  x  =>  x 

which simply returns its argument. Its semantics is clearly independent of
the type of its argument. The following is an example of a higher-order
function definition:


fun  compose  f  g  =  fn  x  =>  f(g(x)) 
For the expression f(g(x)) to be well-typed, the following assumptions
about the types of f, g, and x must hold for some types 'a, 'b, and 'c¹:

x : 'a
f : 'b -> 'c
g : 'a -> 'b

Under these assumptions, compose has the type

compose : ('b -> 'c) -> ('a -> 'b) -> ('a -> 'c)

where -> is the function type constructor. Since this typing holds for
any types 'a, 'b, and 'c, we can think of this type as a universally quantified
type scheme over the type variables, written as

∀α∀β∀γ. (β → γ) → (α → β) → (α → γ)

We  can  now  define  a  function  that  composes  another  function  with  itself: 

fun  twice  f  =  compose  f  f 

The  type  inferred  for  twice  is 

twice : ('a -> 'a) -> ('a -> 'a)

and  we  can  apply  twice  as  follows: 

fun  succ  x  =  x  +  1 
(twice  succ)  3 

evaluating to 5. It is important to note that in the definition of twice, both
occurrences of the argument f are required to have the same type.
Consequently, 'a = 'b = 'c in this instance of compose.
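The same definitions can be transliterated into Haskell, the other language used in this dissertation. The following is a minimal sketch, not part of the original text; the name succ' is chosen only to avoid a clash with the Prelude, and the explicit signatures are the types that type reconstruction would infer.

```haskell
-- Haskell transliteration of compose, twice, and succ from the text.
compose :: (b -> c) -> (a -> b) -> (a -> c)
compose f g = \x -> f (g x)

-- Both occurrences of f must have the same type, so a = b = c here.
twice :: (a -> a) -> (a -> a)
twice f = compose f f

-- succ' is a hypothetical name; the Prelude already defines succ.
succ' :: Int -> Int
succ' x = x + 1

example :: Int
example = twice succ' 3
```

Removing the signature of twice and asking the compiler for its type yields the same type as the one stated above for ML.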

The parameters of a function abstraction, henceforth called λ-bound
identifiers, behave differently from let-bound identifiers:

• All occurrences of a λ-bound identifier have to have the same type.

• Each occurrence of a let-bound identifier may have a different type,
which has to be an instance of the most general or principal type
inferred for that identifier.

¹ ML uses quoted letters to represent the Greek letters often used in type
expressions.
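The difference between the two kinds of binding can be observed directly. The sketch below uses Haskell, whose core type discipline agrees with ML's on this point; the names pairOk and pairBad are hypothetical.

```haskell
-- let-bound: i receives the principal type a -> a, and each
-- occurrence is instantiated separately (at Int and at Bool).
pairOk :: (Int, Bool)
pairOk = let i = \x -> x in (i 3, i True)

-- lambda-bound: if uncommented, this definition is rejected by the
-- type checker, because both occurrences of i must share one type.
-- pairBad = (\i -> (i 3, i True)) (\x -> x)
```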

Furthermore, ML has a built-in type constructor for (homogeneous) lists,
which is parameterized by the element type. Predefined constants and
functions on lists include:


nil  : 'a list
::   : 'a * 'a list -> 'a list
hd   : 'a list -> 'a
tl   : 'a list -> 'a list
null : 'a list -> bool

Lists are written in the form [e1, ..., en].

For  instance, 

(compose hd hd) [[1,2],[3,4,5]]

is type-correct and evaluates to 1, while

(twice hd) [[1,2],[3,4,5]]

is not type-correct, since the type of hd is not of the form 'a -> 'a.

Lastly, ML offers user-defined free algebraic datatypes. A datatype
declaration of the form

datatype [arg] T = K1 of τ1 | ... | Kn of τn

declares a type (or a type constructor, if arguments are present) T, where
the Ki’s are value constructor functions of types τi → (arg T). Value
constructors can also lack the argument, in which case they are constants. The
predefined list type can actually be written as a datatype:

datatype 'a list = nil | cons of 'a * 'a list
Values whose type is such a datatype can be constructed by applying a value
constructor to an argument of appropriate type. They can be deconstructed
by means of a pattern-matching let expression of the (simplified) form

let val K x = e in e'

For example,

val cons(x,xs) = [1,2,3]

would decompose the list on the right-hand side, binding x to 1 and xs to
[2,3].
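For comparison, here is a sketch of the same datatype and its deconstruction in Haskell. The names List, fromList, toList, and unconsList are assumptions made for this illustration; unlike the simplified let form above, the pattern match also covers the empty list.

```haskell
-- A user-defined list type mirroring the ML datatype declaration.
data List a = Nil | Cons a (List a)

-- Conversions to and from built-in lists, for convenience only.
fromList :: [a] -> List a
fromList = foldr Cons Nil

toList :: List a -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

-- Deconstruction by pattern matching on the value constructors.
unconsList :: List a -> Maybe (a, List a)
unconsList Nil         = Nothing
unconsList (Cons x xs) = Just (x, xs)
```

Applying unconsList to fromList [1,2,3] binds the head to 1 and the tail to the list corresponding to [2,3].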

2.1.2  Shortcomings  of  Abstract  Type  Constructs  in  ML 

ML  already  provides  three  distinct  constructs  that  can  be  used  to  describe 
abstract  data  types: 

• The abstype mechanism is used to declare an abstract data type with
a single implementation. It has been partially superseded by the
module system.

• The ML module system provides signatures, structures, and functors.
Signatures act as interfaces of abstract data types and structures as
their implementations; functors are essentially parametrized
structures. Several structures may share the same signature, and a single
structure may satisfy several signatures. However, structures are not
first-class values in ML for type-theoretic reasons discussed in
[Mac86] [MH88]. This leads to considerable difficulties in a number of
practical programming situations. The following example illustrates
how an abstract type STACK is programmed in the ML module system:

signature ELEM = sig
    type elem
    val init : elem
end

signature STACK = sig
    type elem
    type stack
    val empty   : stack
    val push    : elem -> stack -> stack
    val pop     : stack -> stack
    val top     : stack -> elem
    val isempty : stack -> bool
end

functor ListStack(Elem : ELEM) : STACK = struct
    type elem = Elem.elem
    type stack = elem list
    val empty = []
    fun push x xs = x :: xs
    val pop = tl
    val top = hd
    val isempty = null
end

functor ArrayStack(Elem : ELEM) : STACK = struct
    type elem = Elem.elem
    type stack = int ref * elem array
    val maxElem = 100
    val empty =
        (ref 0, Array.array(maxElem, Elem.init))
    fun push x (i,s) =
        (inc i; Array.update(s, !i, x); (i,s))
    fun pop(i,s) = (dec i; (i,s))
    fun top(i,s) = Array.sub(s, !i)
    fun isempty(i,s) = !i = 0
end

structure IntElem = struct
    type elem = int
    val init = 0
end

structure IntListStack = ListStack(IntElem)
structure IntArrayStack = ArrayStack(IntElem)

Note that two different implementations of STACK are given. However,
the types IntListStack.stack and IntArrayStack.stack are
different; thus we cannot construct, for example, the following list:

[IntListStack.empty, IntArrayStack.empty]

• Abstract data types can be implemented as a tuple (or record) of
closures; the hidden bindings shared between the closures correspond to
the representation, and the closures themselves correspond to the
operations. The type of the tuple corresponds to the interface. A discussion
of this approach is found in [Ode91]. The following example illustrates
a use of a heterogeneous list of int Stack’s.

datatype 'a Stack =
    stack of {empty   : unit -> 'a Stack,
              push    : 'a -> 'a Stack,
              pop     : unit -> 'a Stack,
              top     : unit -> 'a,
              isempty : unit -> bool}

fun makeListStack xs =
    stack{empty   = fn() => makeListStack [],
          push    = fn x => makeListStack(x::xs),
          pop     = fn() => makeListStack(tl xs),
          top     = fn() => hd xs,
          isempty = fn() => null xs}

fun makeArrayStack xs = ...

fun empty   (stack{empty=e, ...})   = e()
fun push y  (stack{push=pu, ...})   = pu y
fun pop     (stack{pop=po, ...})    = po()
fun top     (stack{top=t, ...})     = t()
fun isempty (stack{isempty=i, ...}) = i()

map (push 8) [makeListStack [2,4,6],
              makeArrayStack [3,5,7]]

The shortcoming of this approach is that the internal representation of
an instance of an abstract type is completely encapsulated;
consequently, the extensibility of the abstract type is severely limited. The next
example of an abstract type Mult supporting a square operation
illustrates this limitation:

datatype Mult = mult of {square : unit -> Mult}
fun makeMult(i,f) =
    mult{square = fn() => makeMult(f(i,i), f)}
fun square(mult{square=s}) = s()
map square
    [makeMult(3, op * : int * int -> int),
     makeMult(7.5, op * : real * real -> real)]

The problem arises when we want to define an additional operation on
Mult, say cube. In this case, we need to add another field to the record
component type of Mult, and we even need to change the definitions
of makeMult and square, although the latter was defined outside of
makeMult:

datatype Mult = mult of {square : unit -> Mult,
                         cube   : unit -> Mult}

fun makeMult(i,f) =
    mult{square = fn() => makeMult(f(i,i), f),
         cube   = fn() => makeMult(f(i, f(i,i)), f)}

fun square(mult{square=s, ...}) = s()

It is possible to work around this limitation using nested records as
described in [Ode91].

Another, more serious limitation of the encapsulation imposed by the
closure approach becomes apparent when we model abstract types with
operations involving another argument of the same abstract type.
Consider the following attempt at describing an abstract type Tree:


datatype Tree = tree of {eq    : Tree -> bool,
                         right : unit -> Tree,
                         left  : unit -> Tree,
                         ...}

The  eq  function  could  then  be  implemented  by  converting  two  trees  to 
a  common  representation  and  comparing  them.  Suppose  now  that  we 
want  to  compare  two  subtrees  of  the  same  tree.  There  is  no  obvious  way 
to  take  advantage  of  the  knowledge  that  both  subtrees  have  the  same 
representation;  they  still  need  to  be  converted  before  the  comparison. 
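The closure encoding from the last item above can be sketched in Haskell as follows. This is an illustration only, not code from the dissertation: makeCountedStack and topOf are hypothetical names, the second representation (a size counter paired with a list) stands in for the elided array implementation, and, because Haskell is non-strict, the unit arguments of the ML version are unnecessary.

```haskell
-- The interface: a record of closures. The representation lives in
-- the variables captured by the closures and is invisible outside.
data Stack a = Stack
  { push    :: a -> Stack a
  , pop     :: Stack a
  , top     :: a
  , isEmpty :: Bool
  }

-- Helper shared by both implementations.
topOf :: [a] -> a
topOf (y : _) = y
topOf []      = error "top: empty stack"

-- Representation 1: a plain list.
makeListStack :: [a] -> Stack a
makeListStack xs = Stack
  { push    = \x -> makeListStack (x : xs)
  , pop     = makeListStack (drop 1 xs)
  , top     = topOf xs
  , isEmpty = null xs
  }

-- Representation 2 (hypothetical): an explicit size plus a list.
makeCountedStack :: Int -> [a] -> Stack a
makeCountedStack n xs = Stack
  { push    = \x -> makeCountedStack (n + 1) (x : xs)
  , pop     = makeCountedStack (n - 1) (drop 1 xs)
  , top     = topOf xs
  , isEmpty = n == 0
  }

-- A heterogeneous list: both elements have type Stack Int although
-- their hidden representations differ.
stacks :: [Stack Int]
stacks = map (\s -> push s 8)
             [makeListStack [2, 4, 6], makeCountedStack 3 [3, 5, 7]]
```

As in the ML version, adding an operation means changing the record type and every constructor function, and an operation taking a second Stack argument cannot inspect that argument's representation.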

2.1.3  Haskell 

The functional programming language Haskell [HPJW+92] has a
polymorphic type discipline similar to ML’s. In addition, it uses type classes as a
systematic approach to operator overloading. Type classes capture common sets
of operations, for example multiplication, which is common to both int and
real types. A particular type may be an instance of a type class and has an
operation corresponding to each operation defined in the type class. Further,
type classes may be arranged in a class hierarchy, in the sense that a derived
type class captures all operations of its superclasses and may add new ones.
Type classes were first introduced in the article [WB89], which also gives
additional motivating examples and shows how Haskell programs are
translated to ML programs.

The syntax of the Haskell core consists of essentially the same
expressions as the ML core, with the addition of class and instance declarations of
the following form:

class C a where
    op1 :: τ1
    ...
    opn :: τn

instance C t where
    op1 = e1
    ...
    opn = en

To motivate the type class approach, consider the overloading of
mathematical operators in ML. Although 4*4 and 4.7*4.7 are valid ML
expressions, we cannot define a function such as

fun  square  x  =  x  *  x 

in ML, as the overloading of the operator * cannot be resolved
unambiguously. In Haskell, we first declare a class Num to capture the operations Int
and Float have in common:

class Num a where
    (-) :: a -> a
    (+) :: a -> a -> a
    (*) :: a -> a -> a
At this point, we can already type the square function, although we cannot
use it yet, since we do not have any instances of Num. The typing is

square :: Num a => a -> a

which reads, “for any a that is an instance of Num, square has type
a -> a.” We then declare two instances of Num, assuming the existence of
some predefined functions on Int and Float:

instance Num Int where
    (-) = intUMinus
    (+) = intAdd
    (*) = intMult

instance Num Float where
    (-) = floatUminus
    (+) = floatAdd
    (*) = floatMult

When we now write square 4.0, the type reconstructor determines that 4.0
is of type Float, which in turn is an instance of Num. The multiplication
used is floatMult, as specified in the instance declaration for Float.
Given a definition of the function map, we can write the function

squarelist  xs  =  map  square  xs 

which  squares  each  element  in  a  list.  It  has  type 

squarelist :: Num a => [a] -> [a]

where  [a]  is  the  Haskell  version  of  'a  list. 
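The example can be tried in a present-day Haskell system with a small adaptation. In the sketch below, the class Mul and its method mul are hypothetical names standing in for the Num class of the text, chosen so that nothing clashes with the Prelude's own Num; the built-in multiplication plays the role of intMult and floatMult.

```haskell
-- A one-operation class standing in for Num as declared in the text.
class Mul a where
  mul :: a -> a -> a

-- Each instance supplies the multiplication for one type.
instance Mul Int where
  mul = (*)

instance Mul Double where
  mul = (*)

-- square is typed once, against the class, and works for every instance.
square :: Mul a => a -> a
square x = mul x x

squarelist :: Mul a => [a] -> [a]
squarelist xs = map square xs
```

The instance used at each call site is selected from the argument type, so square applied to an Int and square applied to a Double use different multiplications.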

Haskell also provides algebraic datatypes, which differ from the ones in
ML only in that the formal arguments of the type constructor can be
specified to be instances of a certain type class.

It should also be mentioned that Haskell is a pure, non-strict functional
language, whereas ML is a strict language and provides mutable state in the
form of references.
2.2 The Lambda Calculus, Polymorphism, and
Existential Quantification

In this section, we describe the untyped, the simply typed, and the first-order
polymorphic λ-calculi, which constitute the type-theoretic basis for
functional languages such as ML and Haskell. We also give an introduction to
existentially quantified types, which provide a type-theoretic description of
abstract data types.

2.2.1 The Untyped λ-calculus

The untyped λ-calculus is a formal model of computation. While the
λ-calculus is equivalent to Turing machines in computational power, its simple,
functional structure lends itself as a useful model for reasoning about
programs, in particular, functional programs. We give a brief introduction to the
λ-calculus; a comprehensive reference is [Bar84].

λ-terms are defined as follows:

Constants¹    c
Identifiers   x
Terms         e ::= c | x | λx.e | (e e')

In a λ-abstraction a of the form λx.e, where e is some λ-term, the variable
x is said to be bound in a and is called a bound variable. Any variable y in
e other than x that is not bound in a λ-abstraction inside e is said to occur
free in a and is called a free variable. We assume that no free variable is
identical to any bound variable within a λ-term.

¹ Constants are not actually part of the pure λ-calculus, but are a useful
enrichment.

The λ-calculus provides several conversion rules for transforming one
λ-term into an equivalent one. The conversion rules are defined as follows:
• β-conversion:

(λx. e) e' ⇔ e[e'/x]

This rule models function application. It states that a λ-abstraction
λx. e is applied to a term e' by replacing each free occurrence of x in
e by a copy of e'. In addition, bound variables in e have to be renamed
to avoid name conflicts with variables that are free in e'. e[e'/x] stands
for this new term.

• α-conversion:

λx. e ⇔ λy. e[y/x]    y ∉ FV(e)

This rule states that the bound variable of a λ-abstraction may be
renamed, provided that the renamed variable does not occur free in e.

• η-conversion:

λx. (e x) ⇔ e    x ∉ FV(e)

This rule can be used to eliminate a redundant λ-abstraction, provided
that the bound variable does not occur free in e.

• δ-conversion:

The δ-rules define conversion of built-in constants and functions, for
example,

(times 3 4) ⇔ 12

We view the set of λ-terms as divided into α-equivalence classes; this means
that any two λ-terms that can be transformed into one another via
α-conversion are in the same equivalence class, and any one term is viewed as a
representative of its α-equivalence class.




While  conversion  rules  express  that  two  terms  are  equivalent,  reduction 
rules  are  used  to  evaluate  a  term.  There  are  two  reduction  rules,  P-reduction 
and  8-reduction;  the  most  important  rule,  P-reduction,  is  given  by 

(Xx.  e)  e'  =>  e  [ e'/x ] 

Semantically, a λ-term is evaluated by repeatedly applying reduction rules
until no more reductions can be applied; the resulting term is said to be in
normal form. A given λ-term may have several subterms to which β-reduction
can be applied; such subterms are called reducible terms or redexes. An
evaluation strategy where β-reduction is always applied to the leftmost
outermost redex first is called normal order evaluation. A strategy where
β-reduction is always applied to the leftmost innermost redex is called
applicative order evaluation. In programming languages, normal order
evaluation is often implemented by lazy or call-by-name evaluation, and
applicative order evaluation is a special case of eager (call-by-value)
evaluation. Normal order evaluation is normalizing, which means that it
terminates for every term that has a normal form. Although applicative order
evaluation does not guarantee termination, it is sometimes preferred in
practice for efficiency reasons.
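As a small worked illustration of the difference between the two strategies
(our own example, not taken from the references), let Ω abbreviate the term
(λy. y y) (λy. y y), which β-reduces only to itself, and consider the term
(λx. 3) Ω:

```latex
\begin{align*}
\text{normal order:}      \quad & (\lambda x.\,3)\ \Omega \Rightarrow 3 \\
\text{applicative order:} \quad & (\lambda x.\,3)\ \Omega
                                  \Rightarrow (\lambda x.\,3)\ \Omega
                                  \Rightarrow \cdots
\end{align*}
```

The term has the normal form 3, which normal order evaluation reaches in one
step by contracting the outermost redex, whereas applicative order evaluation
keeps reducing the argument Ω and never terminates.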

In the λ-calculus, recursion is expressed by the Y combinator, which is
defined by the equation Y f = f (Y f). The Y combinator can be defined by the
following λ-abstraction:

$$Y = \lambda h.\,((\lambda x.\,h\,(x\,x))\ (\lambda x.\,h\,(x\,x)))$$

A recursive function can then be expressed as a λ-term containing Y, for
example the factorial,

$$Y\,(\lambda f.\,\lambda n.\,(\mathit{if}\ (\mathit{equal}\ n\ 0)\ 1\ (\mathit{times}\ n\ (f\,(\mathit{minus}\ n\ 1)))))$$

assuming suitable δ-rules for the built-in functions used.
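The fixed-point construction can be tried out in an ordinary programming
language. The following is a minimal sketch in Python, which is our choice of
illustration language and not part of the calculus. Because Python evaluates
arguments eagerly, the term Y as written above would loop; the sketch
therefore uses the η-expanded variant (often called Z), and the names Z and
fact are ours.

```python
# Call-by-value fixed-point combinator: the eta-expanded form of Y.
# The inner "lambda v: x(x)(v)" delays the self-application x(x) until
# the recursive function is actually called.
Z = lambda h: (lambda x: h(lambda v: x(x)(v)))(lambda x: h(lambda v: x(x)(v)))

# Factorial written without self-reference, mirroring the lambda-term above;
# Python's conditional, ==, * and - play the role of the delta-rules.
fact = Z(lambda f: lambda n: 1 if n == 0 else n * f(n - 1))

print(fact(5))
```

Here fact never mentions itself; the recursion is supplied entirely by Z.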
2.2.2 The Simply Typed λ-Calculus

Typed λ-calculi are like the untyped λ-calculus, except that every bound
identifier is given a type. The simply typed λ-calculus describes languages
that have a notion of type. Informally, types are subsets of the set of all
values that share a certain common structure, for example all integers, or
all Booleans. An important difference between typed and untyped calculi is
that typed calculi introduce the notion of (static) type correctness of a
term, which one would like to check before trying to evaluate the term. The
untyped λ-calculus could be regarded as a typed λ-calculus in which each
identifier or constant has the same general type and all terms are
type-correct. A comprehensive survey of typing in programming languages is
given in [CW85].

As an example, consider the successor function, which we could define as

$$\mathit{succ} = \lambda n : \mathit{int}.\ n + 1$$

Assuming the typing + : int × int → int, we would obtain the typing

$$\mathit{succ} : \mathit{int} \to \mathit{int}$$

where → is the function type constructor and × the tuple type constructor
used for multiple function arguments. We could then define

$$\mathit{twice} = \lambda f : \mathit{int} \to \mathit{int}.\ \lambda x : \mathit{int}.\ f\,(f\,x)$$

and the term

$$(\mathit{twice}\ \mathit{succ})\ 4$$

would be type-correct and result in 6. On the other hand,

$$\mathit{twice}\ 7$$

would not be type-correct, since the type of 7 is int rather than int → int.
We would like to formalize the notion of type correctness. For example,
to guarantee that a function application is type-correct, it is enough to
know that the argument term is of the same type as the domain type of the
function. Then the type of the resulting term is the range type of the
function. Such a rule is formally expressed by an inference rule consisting
of zero or more antecedents and one conclusion. Each antecedent or conclusion
is a type judgment of the form A ⊢ e : τ where e is a well-formed term, τ a
well-formed¹ type, and A a set of assumptions of the form x : τ stating that
the identifier or constant x has type τ; ⊢ reads as “entails.” For example,
the rule governing function application is written as

$$\frac{A \vdash e : \tau' \to \tau \qquad A \vdash e' : \tau'}{A \vdash (e\ e') : \tau}$$

and is read as: “If assumption set A entails type τ' → τ for expression e and
if A entails type τ' for expression e', then A entails type τ for the
application (e e').”

The type system of a typed λ-calculus is described by a system of such
inference rules. Type-correct terms are those for which a type judgment can
be derived within the given inference system.

The following inference system describes the simply typed λ-calculus:

(TAUT)
$$A \vdash x : A(x)$$

(APP)
$$\frac{A \vdash e : \tau' \to \tau \qquad A \vdash e' : \tau'}{A \vdash (e\ e') : \tau}$$

(ABS)
$$\frac{A[\tau'/x] \vdash e : \tau}{A \vdash \lambda x : \tau'.e : \tau' \to \tau}$$


¹For our purposes, types are well-formed iff they are composed from the basic
types int, bool, etc., by application of the type constructors → and ×.
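where A[τ'/x] stands for the assumption set A extended with the assumption
x : τ'.

One possible proof for the typing twice : (int → int) → (int → int)
looks as follows¹:

$$
\begin{array}{l}
[\mathit{int}/x,\ \mathit{int} \to \mathit{int}/f] \vdash x : \mathit{int} \\
[\mathit{int}/x,\ \mathit{int} \to \mathit{int}/f] \vdash f : \mathit{int} \to \mathit{int} \\
\hline
[\mathit{int}/x,\ \mathit{int} \to \mathit{int}/f] \vdash f\,x : \mathit{int} \\
[\mathit{int}/x,\ \mathit{int} \to \mathit{int}/f] \vdash f : \mathit{int} \to \mathit{int} \\
\hline
[\mathit{int}/x,\ \mathit{int} \to \mathit{int}/f] \vdash f\,(f\,x) : \mathit{int} \\
\hline
[\mathit{int} \to \mathit{int}/f] \vdash \lambda x : \mathit{int}.\ f\,(f\,x) : \mathit{int} \to \mathit{int} \\
\hline
\emptyset \vdash (\lambda f : \mathit{int} \to \mathit{int}.\ \lambda x : \mathit{int}.\ f\,(f\,x)) : (\mathit{int} \to \mathit{int}) \to (\mathit{int} \to \mathit{int})
\end{array}
$$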


When designing a statically typed programming language, we generally
want type correctness to be decidable. That is, we would like to have an
algorithm that decides, given a type judgment, whether there is a proof for
this type judgment. We would also like the type system to be semantically
sound, meaning that a type-correct program can be evaluated without, for
example, trying to apply a term that is not a function to an argument.
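For the simply typed λ-calculus, such a decision procedure can be read off
the three inference rules directly, since exactly one rule applies to each
kind of term. The following Python sketch illustrates this; the term
encoding, the treatment of integer literals, and all names (typeof, arrow,
INT, twice, env) are our own assumptions and not part of the calculus.

```python
# Types: a base type is a string such as "int"; a function type is the
# tuple ("->", domain, range).
# Terms: ("var", x) | ("int", n) | ("app", e1, e2) | ("abs", x, type, body)

INT = "int"

def arrow(dom, rng):
    return ("->", dom, rng)

def typeof(env, term):
    """Return the type of term under the assumption set env (a dict),
    or raise TypeError if no type judgment can be derived."""
    tag = term[0]
    if tag == "int":                       # integer constants have type int
        return INT
    if tag == "var":                       # (TAUT): look x up in A
        if term[1] not in env:
            raise TypeError("unbound identifier " + term[1])
        return env[term[1]]
    if tag == "abs":                       # (ABS): extend A with x : tau'
        _, x, ty, body = term
        return arrow(ty, typeof({**env, x: ty}, body))
    if tag == "app":                       # (APP): domain must match argument
        fun_ty = typeof(env, term[1])
        arg_ty = typeof(env, term[2])
        if not (isinstance(fun_ty, tuple) and fun_ty[1] == arg_ty):
            raise TypeError("ill-typed application")
        return fun_ty[2]
    raise ValueError("unknown term")

INT_TO_INT = arrow(INT, INT)
twice = ("abs", "f", INT_TO_INT,
         ("abs", "x", INT,
          ("app", ("var", "f"), ("app", ("var", "f"), ("var", "x")))))
env = {"succ": INT_TO_INT}                 # assumed typing of succ

print(typeof({}, twice))
print(typeof(env, ("app", ("app", twice, ("var", "succ")), ("int", 4))))
```

Applying typeof to the encoding of twice 7 raises TypeError, matching the
observation that this term is not type-correct.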

¹The horizontal bars are read in top-down order.

2.2.3 The Typed λ-Calculus with let-Polymorphism

The typed λ-calculus with let-polymorphism is a formalization of the idea
that there are λ-abstractions that have many different types depending on
their argument terms. It provides a type-theoretic model for the language
ML described in Section 2.1.1. As a motivating example, consider that the
result of the (untyped) λ-abstraction

$$\mathit{id} = \lambda x.x$$

always has the same type as its argument. We can express this by giving id
the universally quantified type id : ∀α.α → α, where α is a type variable
representing any well-formed type; this typing is read as: “for every type α,
id has type α → α.” While ∀α.α → α is not a well-formed type, it makes
sense to think of α as a type parameter, which can be instantiated to the
type of the argument passed to id. We therefore call constructs such as
∀α.α → α type schemes or polymorphic types, whereas types that are not
universally quantified are called monomorphic types. In our typed calculus,
we can think of id as first parameterized by the argument type and then by
the argument itself. This is expressed by a λ-abstraction enclosed by a
Λ-abstraction, which denotes abstraction over a type argument:

$$\mathit{id} = \Lambda\alpha.\lambda x : \alpha.x$$

and its application to an argument has the form

$$\mathit{id}\ [\mathit{int}]\ 3,$$

where type arguments to a Λ-abstraction are enclosed in square brackets.
Terms are called polymorphic or monomorphic depending on their type. In
the typed λ-calculus with let-polymorphism, we do not allow arguments of
λ-abstractions to be polymorphic. Consequently, a Λ-abstraction can only
occur at the outermost level of a term or on the right side of a special
binding construct that expresses the binding of a term to an identifier.
This construct is called a let-expression and is of the form
let x = e in e'.

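The following inference system describes the typed λ-calculus with
let-polymorphism, where τ's stand for types and σ's for type schemes.

(TAUT)
$$A \vdash x : A(x)$$

(APP)
$$\frac{A \vdash e : \tau' \to \tau \qquad A \vdash e' : \tau'}{A \vdash (e\ e') : \tau}$$

(ABS)
$$\frac{A[\tau'/x] \vdash e : \tau}{A \vdash \lambda x : \tau'.e : \tau' \to \tau}$$

(LET)
$$\frac{A \vdash e : \sigma \qquad A[\sigma/x] \vdash e' : \tau}{A \vdash \mathit{let}\ x = e\ \mathit{in}\ e' : \tau}$$

(INST)
$$\frac{A \vdash e : \forall\alpha.\sigma}{A \vdash e\,[\tau] : \sigma[\tau/\alpha]}$$

(GEN)
$$\frac{A \vdash e : \sigma \qquad \alpha \notin FV(A)}{A \vdash \Lambda\alpha.e : \forall\alpha.\sigma}$$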


Note that the ABS rule requires the expression from which the abstraction
is constructed to be monomorphic, and the APP rule enforces that in an
application the function and its argument have to be monomorphic.

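The following is a sample proof in this system:

$$
\begin{array}{l}
[\alpha/x] \vdash x : \alpha \\
\hline
\emptyset \vdash \lambda x : \alpha.x : \alpha \to \alpha \\
\hline
\emptyset \vdash \Lambda\alpha.\lambda x : \alpha.x : \forall\alpha.\alpha \to \alpha \\
\\
[\forall\alpha.\alpha \to \alpha/\mathit{id}] \vdash \mathit{id}\ [\mathit{int}] : \mathit{int} \to \mathit{int} \\
[\forall\alpha.\alpha \to \alpha/\mathit{id}] \vdash 3 : \mathit{int} \\
\hline
[\forall\alpha.\alpha \to \alpha/\mathit{id}] \vdash \mathit{id}\ [\mathit{int}]\ 3 : \mathit{int} \\
\hline
\emptyset \vdash \mathit{let}\ \mathit{id} = \Lambda\alpha.\lambda x : \alpha.x\ \mathit{in}\ \mathit{id}\ [\mathit{int}]\ 3 : \mathit{int}
\end{array}
$$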

The ML core language can be thought of as an implicitly typed version
of the typed λ-calculus with let-polymorphism; this is discussed in detail
in [MH88]. ML uses type reconstruction to compute the explicit type
annotations of an implicitly typed expression. The problem of polymorphic
type reconstruction was first discussed in [Mil78] and further developed in
[DM82] and [Dam85].
2.2.4 Higher-Order Typed λ-Calculi

A considerable amount of research has focused on the second-order typed
λ-calculus and higher-order systems [REFS]. Since all languages presented in
this dissertation are extensions of the typed λ-calculus with
let-polymorphism, we do not further discuss higher-order calculi here.

2.2.5  Existential  Quantification 

Existentially quantified types, or in short, existential types, are a
type-theoretic formalization of the concept of abstract data types, which
are featured in different forms by various programming languages.

In Ada, abstract types are expressed via private types. Consider as an
example the following package specification, which describes a stack of
integers:

    package STACK_PKG is
        type STACK_TYPE is private;
        procedure PUSH(S : in out STACK_TYPE;
                       A : INTEGER);
        procedure POP(S : in out STACK_TYPE);
    private
        type STACK_TYPE is ...;
    end STACK_PKG;

We can then write in our program

    use STACK_PKG;

and have access to the entities defined in the package specification without
knowing or wanting to know how STACK_TYPE is defined in the package
body. Since the program using the package works independently of the
implementation of the package, we might wonder what type STACK_TYPE
stands for in the program. An informal answer is, “some new type that is
different from any other type in the program.”
Existential quantification is a formalization of the notion of abstract
types; it is described in [CW85] and further explored in [MP88]. By stating
that an expression e has existential type ∃α.τ, we mean that for some fixed,
unknown type τ', e has type τ[τ'/α]. Thus e can be viewed as a pair
consisting of a type component τ' and a value component of type τ[τ'/α]. The
components are accessed through an elimination construct of the form

    open e as (t, x) in e'

In e', the type t stands for the hidden representation type of e, such that
x can be used in e' with type τ[t/α]. To guarantee static typing, the type
of e' must not contain t.

Values of existential type are created using the construct

    pack ⟨α = τ', e : τ⟩

where α may occur free in τ. The type of this expression is ∃α.τ, and at
this point we no longer know that the expression we packed originally had
type τ[τ'/α].

A different formulation of existential quantification called the dot
notation, closer to actual programming languages, is described in [CL90].

2.3  Type  Reconstruction 

In this section, we describe the Damas-Milner approach to type
reconstruction in ML [Mil78] [DM82] [Dam85] and its application to type
reconstruction in Haskell [NS91].

2.3.1  Type  Reconstruction  for  ML 

Before we present the type inference system and the type reconstruction
algorithm for the ML core, we need to define the following terms:

• A substitution is a finite mapping from type variables to types. It is
often written in the form [τ₁/α₁, ..., τₙ/αₙ] and applied as a postfix
operator; it can also be given a name, for example, S, and applied as a
prefix operator. If σ = ∀β₁...βₘ.τ is a type scheme, then Sσ is the type
scheme obtained from σ by replacing each free occurrence of αᵢ in σ
by τᵢ, renaming the bound variables of σ if necessary. Let Id denote
the identity substitution [ ].

• Type τ is a principal type of expression e under assumption set A if
A ⊢ e : τ and whenever A ⊢ e : τ' then there is a substitution S such
that Sτ = τ'.

We now give a type inference system that describes the type system of
the implicitly typed first-order polymorphic λ-calculus underlying the ML
core. This type system is deterministic in that there is exactly one rule for
each kind of expression. It was shown in [CDDK86] to be equivalent to the
original nondeterministic system from [DM82].

(TAUT)
$$\frac{A(x) \geq \tau}{A \vdash x : \tau}$$

(APP)
$$\frac{A \vdash e : \tau' \to \tau \qquad A \vdash e' : \tau'}{A \vdash (e\ e') : \tau}$$

(ABS)
$$\frac{A[\tau'/x] \vdash e : \tau}{A \vdash \lambda x.e : \tau' \to \tau}$$

(LET)
$$\frac{A \vdash e : \tau \qquad A[\mathit{gen}(A, \tau)/x] \vdash e' : \tau'}{A \vdash \mathit{let}\ x = e\ \mathit{in}\ e' : \tau'}$$


The following auxiliary definitions are needed:

• In the generic instantiation of a type scheme to a type, each generic
(universally quantified) type variable is replaced by a type.

$$\forall\alpha_1 \ldots \alpha_n.\tau \geq \tau' \ \text{ iff there are types } \tau_1, \ldots, \tau_n \text{ such that } \tau' = \tau[\tau_1/\alpha_1, \ldots, \tau_n/\alpha_n]$$

• The generalization of a type τ under an assumption set A is the type
scheme obtained from τ by universally quantifying over those type
variables that are free in τ but not in the assumption set A.

$$\mathit{gen}(A, \tau) = \forall(FV(\tau) \setminus FV(A)).\tau$$

For instance, ∀α.α → α ≥ int → int but not ∀α.α → α ≥ int → real, and
gen([β/x], α → β) = ∀α.α → β.

Now we have a type inference system that defines what is a valid typing
judgment in the ML core. However, we are actually interested in an
algorithm that tells us whether a given (implicitly typed) core-ML
expression is type correct, and if so, what its principal type is. Given an
assumption set A and an expression e, the algorithm returns
W(A, e) = (S, τ), where S is a substitution and τ a type. We want this
algorithm to be syntactically sound and complete:

• Syntactic soundness: If W(A, e) = (S, τ), then SA ⊢ e : τ is a valid
typing judgment, that is, we can prove it in the inference system.

• Syntactic completeness and principal typing: Whenever A ⊢ e : τ,
then W(A, e) = (S, τ') terminates and τ' ≥ τ is a principal type for e
under A.

The following algorithm from [DM82] has the desired properties, as proved
in [Dam85]. inst∀(∀α₁...αₙ.τ) replaces each occurrence of αᵢ in τ with a
fresh type variable, and gen is defined as in the inference rules.

    W(A, x) =
        (Id, inst∀(A(x)))

    W(A, e e') =
        let (S, τ) = W(A, e)
            (S', τ') = W(SA, e')
            β be a fresh type variable
            U = mgu(S'τ, τ' → β)
        in (US'S, Uβ)

    W(A, λx.e) =
        let β be a fresh type variable
            (S, τ) = W(A[β/x], e)
        in (S, (Sβ) → τ)

    W(A, let x = e in e') =
        let (S, τ) = W(A, e)
            (S', τ') = W((SA)[gen(SA, τ)/x], e')
        in (S'S, τ')

The function mgu(τ₁, τ₂) computes a most general unifier U for τ₁ and τ₂,
that is, a most general substitution such that Uτ₁ = Uτ₂, if one exists;
otherwise mgu fails. The idea is that we substitute actual types for the
fresh type variables generated by applications of inst∀(∀α₁...αₙ.τ), and
that when the algorithm W terminates, we have constructed a proof in our
inference system whose structure corresponds to the structure of the
expression itself.
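To make the interplay of mgu, inst∀, and gen concrete, the following Python
sketch transcribes the four cases of W. It is an illustration under our own
assumptions and not the formulation of [DM82]: the term and type encodings,
the dictionary representation of substitutions, and all function names are
hypothetical, and failure is signalled by raising TypeError.

```python
import itertools

# Types: a type variable is a str ("t0"); a constructed type is a tuple
# (constructor, arg1, ..., argn), e.g. ("int",) or ("->", dom, rng).
# Type schemes: ("forall", [bound vars], type). Substitutions: dict var -> type.
# Terms: ("var", x) | ("app", e, e') | ("abs", x, e) | ("let", x, e, e')
_counter = itertools.count()

def fresh():
    return "t%d" % next(_counter)

def apply(s, t):
    if isinstance(t, str):
        return s.get(t, t)
    return (t[0],) + tuple(apply(s, a) for a in t[1:])

def apply_env(s, env):
    """Apply s to the free variables of every scheme in the assumption set."""
    return {x: ("forall", vs, apply({v: u for v, u in s.items() if v not in vs}, t))
            for x, (_, vs, t) in env.items()}

def compose(s2, s1):
    """The substitution 'first s1, then s2' (written S2 S1 in the text)."""
    return {**s2, **{v: apply(s2, t) for v, t in s1.items()}}

def ftv(t):
    if isinstance(t, str):
        return {t}
    return set().union(*[ftv(a) for a in t[1:]])

def ftv_env(env):
    return set().union(*[ftv(t) - set(vs) for _, vs, t in env.values()])

def mgu(t1, t2):
    if isinstance(t1, str):
        if t1 == t2:
            return {}
        if t1 in ftv(t2):
            raise TypeError("occurs check failed")
        return {t1: t2}
    if isinstance(t2, str):
        return mgu(t2, t1)
    if t1[0] != t2[0] or len(t1) != len(t2):
        raise TypeError("constructor clash")
    s = {}
    for a, b in zip(t1[1:], t2[1:]):       # unify arguments left to right
        s = compose(mgu(apply(s, a), apply(s, b)), s)
    return s

def inst(scheme):
    _, vs, t = scheme
    return apply({v: fresh() for v in vs}, t)

def gen(env, t):
    return ("forall", sorted(ftv(t) - ftv_env(env)), t)

def W(env, e):
    tag = e[0]
    if tag == "var":
        return {}, inst(env[e[1]])
    if tag == "app":
        s, t = W(env, e[1])
        s1, t1 = W(apply_env(s, env), e[2])
        beta = fresh()
        u = mgu(apply(s1, t), ("->", t1, beta))
        return compose(u, compose(s1, s)), apply(u, beta)
    if tag == "abs":
        beta = fresh()
        s, t = W({**env, e[1]: ("forall", [], beta)}, e[2])
        return s, ("->", apply(s, beta), t)
    if tag == "let":
        s, t = W(env, e[2])
        env1 = apply_env(s, env)
        s1, t1 = W({**env1, e[1]: gen(env1, t)}, e[3])
        return compose(s1, s), t1
    raise ValueError("unknown term")

INT = ("int",)
env = {"succ": ("forall", [], ("->", INT, INT)), "three": ("forall", [], INT)}
ident = ("abs", "x", ("var", "x"))
# let id = \x.x in (id succ) (id three): id is used at two different types.
prog = ("let", "id", ident,
        ("app", ("app", ("var", "id"), ("var", "succ")),
                ("app", ("var", "id"), ("var", "three"))))
print(W(env, prog)[1])
```

In the sample program, the let-bound identifier is generalized by gen and
then instantiated twice by inst, once at type (int → int) → (int → int) and
once at int → int. A self-application such as λx. x x is rejected by the
occurs check in mgu.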

2.3.2  Order-Sorted  Unification  for  Haskell 

In Haskell, we have a three-level world consisting of values, types, and type
classes. While types in core-ML are not classified¹, Haskell type classes
classify types into partially ordered sorts. This is in contrast to those
type systems where types themselves are partially ordered, for example the
one of OBJ [FGJM85]. Order-sorted unification [MGS89] can be used to obtain
a type reconstruction algorithm in an order-sorted type system such as
Haskell’s; this is described in [NS91].

¹Actually, in Standard ML types are classified into types with and without
an equality operation defined for them.

An order-sorted signature Σ consists of three parts: a set of sort symbols,
a sort hierarchy, and a set of arity declarations. The sort hierarchy is
simply a partial order on the sorts. Arity declarations are of the form
χ : (γ₁, ..., γₙ)γ, where χ is a type constructor and γ, γ₁, ..., γₙ are
sorts. The set of order-sorted type expressions is the least set satisfying
the following two conditions:

• If τ has sort γ' and γ' ≤ γ in the sort hierarchy, then τ also has sort γ.

• If the type expressions τ₁, ..., τₙ have sorts γ₁, ..., γₙ, respectively,
and χ : (γ₁, ..., γₙ)γ is in the set of arity declarations, then the
application χ(τ₁, ..., τₙ) of the type constructor has sort γ.

Substitutions are defined as in Section 2.3.1, but in addition, they must be
sort-correct: If type variable α has sort γ, expressed by writing α_γ, then
S(α) must also have sort γ.


The following example of a sort hierarchy shows the Haskell numeric
class hierarchy:

[Figure: the Haskell numeric class hierarchy. Only the node label RealFloat
survives in this copy.]
As an example for a set of arity declarations, consider the following
declarations for the type constructors pair and list:

    pair : (Ω, Ω)Ω
    pair : (Eq, Eq)Eq
    pair : (Ord, Ord)Ord
    list : (Ω)Ω
    list : (Eq)Eq
    list : (Ord)Ord
    list : (Num)Num
    list : (Ω)Ω

Given a set Γ of equations over type expressions constructed from Σ, a
unifier of Γ is a substitution θ such that θ(τ₁) = θ(τ₂) for all equations
τ₁ = τ₂ in Γ. An order-sorted signature is called unitary if for all such
equation sets Γ there is a complete set of unifiers containing at most one
element. Since unitary signatures guarantee principal types, we give the
following conditions from [SS85] to guarantee that a finite signature is
unitary:

• Regularity: Each type has a least sort.

• Downward completeness: Any two sorts have either no lower bound or
an infimum.

• Injectivity: χ : (γ₁, ..., γₙ)γ and χ : (γ'₁, ..., γ'ₙ)γ imply γᵢ = γ'ᵢ
for all i = 1, ..., n.

• Subsort reflection: χ : (γ'₁, ..., γ'ₙ)γ' and γ' ≤ γ imply
χ : (γ₁, ..., γₙ)γ for some γ₁ ≥ γ'₁, ..., γₙ ≥ γ'ₙ.

Haskell imposes context conditions to guarantee that the signatures that
arise in Haskell programs are unitary; this is further discussed in Chapter 5.
2.4 Semantics

It is often convenient to use a denotational semantics to reason about the
evaluation of λ-expressions. A denotational semantics is given in terms of
an evaluation function that maps syntactic terms to semantic values in a
semantic domain. The evaluation function E⟦e⟧ρ interprets an expression e
in the environment ρ and returns a value in the domain V. An evaluation
environment is a finite mapping from identifiers to semantic values. A
semantic domain is an algebraic structure that allows us to represent
(semantic) values corresponding to the (syntactic) entities in our calculus.

2.4.1  Recursive  Domains 

The notion of domains goes back to [SS71]. To illustrate this notion, we
recall that in the untyped λ-calculus we start out with the built-in
constants (integers, Booleans, etc.) and are able to define functions over
the constants. We can further define functions that range over these
functions and so on. This structure is reflected in the definition of the
domain V that satisfies the following isomorphism:

$$V \cong B + N + (V \to V) + \{\mathit{wrong}\}_\bot$$

Here + stands for the coalesced sum, so that all types over V share the same
least element ⊥. In other words, V is isomorphic to the sum of the Boolean
values B, the natural numbers N, the continuous functions from V to V, and
a value wrong representing runtime type errors.

Solutions of equations of this kind can be found in the class of continuous
functions over complete partial orders. A complete partial order (cpo)
consists of a set D and a partial order ≤ on D such that

• there is a least element ⊥ in D, and

• each increasing sequence x₀ ≤ ... ≤ xₙ ≤ ... has a least upper bound
(lub).

A function f is continuous iff it preserves lubs of increasing sequences,
that is,

$$f\Big(\bigsqcup_{n \geq 0} x_n\Big) = \bigsqcup_{n \geq 0} f(x_n)$$

An element of a cpo is called ω-finite iff whenever it is less than the lub
of an increasing sequence it is less than some element in the sequence.
Finally, a domain is defined as a cpo satisfying the following conditions:

• Consistently complete: Any consistent subset of V has a least upper
bound, where X ⊆ V is consistent if it has an upper bound in V.

• ω-algebraic: V has countably many ω-finite elements, and given any
x ∈ V, the set of ω-finite elements less than x is directed and has x as
its least upper bound.

The ω-finite elements in any subset X of a cpo are denoted by X°.

Our domain V can be constructed via a limiting process described in
[Smy77].

2.4.2  Weak  Ideals 

Ideals  [MPS86]  capture  the  notion  of  sets  of  structurally  similar  values  and 
have  proven  to  be  a  useful  model  for  types.  As  a  detailed  treatment  of  the 
ideal  model  goes  beyond  the  scope  of  this  dissertation,  we  confine  ourselves 
to  a  summary  of  the  properties  relevant  to  our  work. 

A subset I of a domain D is a (weak) ideal iff it satisfies the following
conditions:

• I ≠ ∅,

• for all y ∈ I and x ∈ D, x ≤ y implies x ∈ I, and

• for all increasing sequences ⟨xₙ⟩, xᵢ ∈ I for all i ≥ 0 implies ⊔xₙ ∈ I.

Ideals have the pleasant property that they form a complete lattice with
their greatest lower bounds given by set-theoretic intersection and their
least upper bounds given by the following formula, stating that finite lubs
are given by set-theoretic union:

$$\Big(\bigsqcup_n I_n\Big)^\circ = \bigcup_n \big(I_n^\circ\big)$$

The ideals over domain V form a complete metric space, on which a Banach
fixed-point theorem holds. This allows us to model recursively defined
types as fixed points of contractive maps on ideals. The maps on ideals
corresponding to the type constructors in our type model (see Section 3.6.3)
are contractive and consequently, our recursively defined algebraic data
types¹ have a well-defined semantics.

¹Algebraic data types in our language are a restricted version of ML
datatypes.


3 An Extension of ML with First-Class Abstract Types


This chapter presents a semantic extension of ML, where the component
types of a datatype may be existentially quantified. We show how datatypes
over existential types add significant flexibility to the language without
even changing ML syntax. We then describe a deterministic Damas-Milner type
inference system [DM82] [CDDK86] for this language, which leads to a
syntactically sound and complete type reconstruction algorithm.
Furthermore, the type system is shown to be semantically sound with respect
to a standard denotational semantics.

3.1  Introduction 

In ML, datatype declarations are of the form

    datatype [arg] T = K₁ of τ₁ | ... | Kₙ of τₙ

where the K’s are value constructors and the optional prefix argument arg
is used for formal type parameters, which may appear free in the component
types τᵢ. The types of the value constructor functions are universally
quantified over these type parameters, and no other type variables may
appear free in the τᵢ’s.
An example for an ML datatype declaration is

    datatype 'a Mytype = mycons of 'a * ('a -> int)

Without altering the syntax of the datatype declaration, we now give a
meaning to type variables that appear free in the component types, but do
not occur in the type parameter list. We interpret such type variables as
existentially quantified.

For example,

    datatype Key = key of 'a * ('a -> int)

describes a datatype with one value constructor whose arguments are pairs
of a value of type 'a and a function from type 'a to int. The question is
what we can say about 'a. The answer is, nothing, except that the value is
of the same type 'a as the function domain. To illustrate this further, the
type of the expression

    key(3, fn x => 5)

is Key, as is the type of the expression

    key([1,2,3], length)

where length is the built-in function on lists. Note that no argument types
appear in the result type of the expression. On the other hand,

    key(3, length)

is not type-correct, since the type of 3 is different from the domain type of
length.

We recognize that Key is an abstract type consisting of a value of some
type and an operation on that type yielding an int. It is important to note
that values of type Key are first-class; they may be created dynamically and
passed around freely as function parameters. The two different values of
type Key in the previous examples may be viewed as two different
implementations of the same abstract type.
Besides constructing values of datatypes with existential component
types, we can decompose them using the let construct. We impose the
restriction that no type variable that is existentially quantified in a let
expression appears in the result type of this expression or in the type of a
global identifier. Analogous restrictions hold for the corresponding open
and abstype constructs described in [CW85] [MP88].

For example, assuming x is of type Key, then

    let val key(v,f) = x in
        f v
    end

has a well-defined meaning, namely the int result of f applied to v. We
know that this application is type-safe because the pattern matching
succeeds, since x was constructed using constructor key, and at that time it
was enforced that f can safely be applied to v. On the other hand,

    let val key(v,f) = x in
        v
    end

is not type-correct, since we do not know the type of v statically and,
consequently, cannot assign a type to the whole expression.

Our extension to ML allows us to deal with existential types as described
in [CW85] [MP88], with the further improvement that decomposed values
of existential type are let-bound and may be instantiated polymorphically.
This is illustrated by the following example,

    datatype 'a t = k of ('a -> 'b) * ('b -> int)
    let val k(f1,f2) = k(fn x => x, fn x => 3) in
        (f2(f1 7), f2(f1 true))
    end

which results in (3,3). In most previous work, the value on the right-hand
side of the binding would have to be bound and decomposed twice.
3.2 Some Motivating Examples

3.2.1 Minimum over a Heterogeneous List

Extending the previous example, we first show how we construct
heterogeneous lists over different implementations of the same abstract type
and define functions that operate uniformly on such heterogeneous lists. A
heterogeneous list of values of type Key could be defined as follows:

    val hetlist = [key(3, fn x => x),
                   key([1,2,3,4], length),
                   key(7, fn x => 0),
                   key(true, fn x => if x then 1 else 0),
                   key(12, fn x => 3)]

The type of hetlist is Key list; it is a homogeneous list of elements each
of which could be a different implementation of type Key. We define the
function min, which finds the minimum of a list of Key’s with respect to the
integer value obtained by applying the second component (the function) to
the first component (the value).

    fun min [x] = x
      | min ((key(v1,f1))::xs) =
            let val key(v2,f2) = min xs in
                if f1 v1 <= f2 v2 then
                    key(v1,f1)
                else
                    key(v2,f2)
            end

Then min hetlist returns key(7, fn x => 0), the third element of the
list.
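The same computation can be mimicked in Python, where the pairing of a value
with its operation is carried by an object. This is a sketch under our own
assumptions: the class Key, the method measure, and the function key_min are
hypothetical names, and the static guarantees of the ML extension are not
reproduced.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Key:
    value: Any                       # hidden representation
    to_int: Callable[[Any], int]     # operation on that representation

    def measure(self):
        """Apply the packed operation to the packed value."""
        return self.to_int(self.value)

def key_min(keys):
    """Mirror of the recursive `min` above for a non-empty list;
    on ties the earlier element wins, as with `<=` in the ML version."""
    first, rest = keys[0], keys[1:]
    if not rest:
        return first
    best = key_min(rest)
    return first if first.measure() <= best.measure() else best

hetlist = [Key(3, lambda x: x),
           Key([1, 2, 3, 4], len),
           Key(7, lambda x: 0),
           Key(True, lambda x: 1 if x else 0),
           Key(12, lambda x: 3)]

smallest = key_min(hetlist)
print(smallest.value, smallest.measure())
```

The elements measure 3, 4, 0, 1, and 3, respectively, so the third element is
selected, in agreement with the result stated for min hetlist.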

3.2.2  Stacks  Parametrized  by  Element  Type 

The  preceding  example  involves  a  datatype  with  existential  types  but  with¬ 
out  polymorphic  type  parameters.  As  a  practical  example  involving  both  ex- 
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istential  and  universal  quantification,  we  show  an  abstract  stack  parame¬ 
trized  by  element  type. 


datatype  ' a  Stack  = 
stack  of  {value 
empty 
push 
pop 
top 

is empty 


'b, 

'b, 

'  a  ->  ' b  ->  ' b 
'  b  ->  ' b 
'  b  ->  ' a, 

' b  ->  bool} 


An  on-the-fly  implementation  of  an  int  Stack  in  terms  of  the  built-in  type 
list  can  be  given  as 


stack {value  =  [1,2,3],  empty  =  [] , 

push  =  fn  x  =>  fn  xs  =>  x  : :  xs, 
pop  =  tl,  top  =  hd,  isempty  =  null} 

An alternative implementation of Stack could be based, for example, on
arrays. We provide a constructor for each implementation:

fun makeListStack xs = stack{value = xs, empty = [],
                             push = fn x => fn xs => x :: xs, pop = tl,
                             top = hd, isempty = null}

fun makeArrayStack xs = stack{...}

To  achieve  dynamic  dispatching,  one  can  provide  stack  operations  that  work 
uniformly  across  implementations.  These  “outer”  wrappers  work  by  opening 
the  stack,  applying  the  intended  “inner”  operations,  and  encapsulating  the 
stack  again,  for  example: 

fun push a (stack{value = v, push = pu, empty = e,
                  pop = po, top = t, isempty = i}) =
    stack{value = pu a v, push = pu,
          empty = e, pop = po,
          top = t, isempty = i}

Different implementations could then be combined in a list of stacks, and we
can uniformly apply the wrapper push to each element of the list:

map (push 8) [makeListStack [2,4,6],
              makeArrayStack [3,5,7]]
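A rough Python analogue of the open, apply, and re-encapsulate pattern is sketched below. Since the original leaves the array-based implementation elided, `make_tuple_stack` is a hypothetical second representation invented for this illustration (a tuple with the top at the end); it is not the author's makeArrayStack. All other names are likewise assumptions of the sketch.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable


@dataclass(frozen=True)
class Stack:
    """A stack packaged with its own operations; `value` has a hidden
    representation that only the packaged operations understand."""
    value: Any
    empty: Any
    push: Callable[[Any, Any], Any]
    pop: Callable[[Any], Any]
    top: Callable[[Any], Any]
    isempty: Callable[[Any], bool]


def make_list_stack(xs):
    # List representation with the top at index 0, like ML's hd and tl.
    return Stack(value=list(xs), empty=[],
                 push=lambda x, s: [x] + s,
                 pop=lambda s: s[1:],
                 top=lambda s: s[0],
                 isempty=lambda s: len(s) == 0)


def make_tuple_stack(xs):
    # Hypothetical second representation: a tuple with the top at the END.
    return Stack(value=tuple(reversed(xs)), empty=(),
                 push=lambda x, s: s + (x,),
                 pop=lambda s: s[:-1],
                 top=lambda s: s[-1],
                 isempty=lambda s: len(s) == 0)


def push(a, st):
    """Outer wrapper: open the stack, apply the inner push, re-encapsulate."""
    return replace(st, value=st.push(a, st.value))


def top(st):
    """Outer wrapper for the inner top operation."""
    return st.top(st.value)


stacks = [push(8, s) for s in [make_list_stack([2, 4, 6]),
                               make_tuple_stack([3, 5, 7])]]
```

Both wrappers dispatch through the operations stored in each value, so the caller never needs to know which representation it is holding.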

3.2.3  Squaring  a  Heterogeneous  List  of  Numbers 

The  next  example  shows  that  datatypes  with  existential  component  types 
provide  high  extensibility.  The  following  type  describes  an  abstract  data 
type  consisting  of  a  number  and  a  multiplication  function  that  can  be  used 
on  the  number: 

datatype  Mult  =  mult  of  'a  *  ('a  *  'a  ->  'a) 

We  define  a  function  that  squares  an  abstract  number: 

fun square (mult(x,f)) = mult(f(x,x), f)

Now we can square each element of a heterogeneous list of numbers in the
following fashion:

map square
    [mult(3, op * : int * int -> int),
     mult(7.5, op + : real * real -> real)]

New functions using the abstract type Mult can be added easily without
modifying the previous definitions. This provides high extensibility in
comparison with the closure approach; see also the example in Section 2.1.2.
For example, we can add a function cube and raise each element of a list to
its cube:

fun cube (mult(x,f)) = mult(f(x, f(x,x)), f)

map cube [mult(8, op * : int * int -> int),
          mult([1,2,3], op @)]
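For comparison, a small Python sketch of the same extensibility argument follows. `Mult`, `square`, and `cube` mirror the ML definitions; the use of `operator.mul` and `operator.add` as stand-ins for `op *`, `op +`, and `op @` is an assumption of this illustration.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Mult:
    """An abstract number: a hidden value and a binary operation on it."""
    x: Any
    f: Callable[[Any, Any], Any]


def square(m: Mult) -> Mult:
    return Mult(m.f(m.x, m.x), m.f)


def cube(m: Mult) -> Mult:
    # Added later without touching Mult or square.
    return Mult(m.f(m.x, m.f(m.x, m.x)), m.f)


squared = [square(m).x for m in [Mult(3, operator.mul),
                                 Mult(7.5, operator.add)]]
cubed = [cube(m).x for m in [Mult(8, operator.mul),
                             Mult([1, 2, 3], operator.add)]]
```

Here `operator.add` on lists plays the role of list append, so cubing the list packages three copies of it.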
3.2.4  Abstract  Binary  Trees  with  Equality

This example shows that abstract data types with binary operations can be
modeled conveniently and naturally using existential types. This is in
contrast to the tree example in Section 2.1.2. We start with the following
datatype declaration:

datatype Tree = tree of {value : 'b,
                         empty : 'b,
                         eq    : 'b * 'b -> bool,
                         left  : 'b -> 'b,
                         right : 'b -> 'b,
                         join  : 'b * 'b -> 'b}

Assuming that t has type Tree, we can now check whether the left and right
subtrees of t are equal:

let val tree{value=v, left=l, right=r, eq=eq, ...} = t
in
    eq(l v, r v)
end

As  opposed  to  the  closure  approach,  where  we  would  have  to  convert  both 
subtrees  to  a  common  representation,  we  can  take  advantage  of  the  fact  that 
two  subtrees  of  a  tree  already  have  the  same  representation. 
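The point about binary operations can be illustrated with a Python sketch in which the equality test receives two values that are guaranteed to share a representation because both come from the same package. The nested-tuple representation and the names `make_tuple_tree` and `subtrees_equal` are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Tree:
    """A tree packaged with operations over its hidden representation."""
    value: Any
    empty: Any
    eq: Callable[[Any, Any], bool]
    left: Callable[[Any], Any]
    right: Callable[[Any], Any]
    join: Callable[[Any, Any], Any]


def make_tuple_tree(value=None):
    # Assumed representation: None is the empty tree, (l, r) is a node.
    return Tree(value=value, empty=None,
                eq=lambda a, b: a == b,
                left=lambda t: t[0],
                right=lambda t: t[1],
                join=lambda l, r: (l, r))


def subtrees_equal(t: Tree) -> bool:
    """Both arguments of eq come from the same package, so they already
    share one representation and no conversion is needed."""
    return t.eq(t.left(t.value), t.right(t.value))


balanced = make_tuple_tree(((None, None), (None, None)))
lopsided = make_tuple_tree((None, (None, None)))
```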

3.3  Syntax 

3.3.1  Language  Syntax 

Identifiers    $x$

Constructors   $K$

Expressions    $e ::= x \mid (e_1, e_2) \mid e\ e' \mid \lambda x.e \mid$
               $\text{let } x = e \text{ in } e' \mid$
               $\text{data } \forall\alpha_1 \ldots \alpha_n.\chi \text{ in } e \mid K \mid \text{is } K \mid$
               $\text{let } K\ x = e \text{ in } e'$

In addition to the usual constructs (identifiers, applications,
$\lambda$-abstractions, and let expressions), we introduce desugared versions
of the ML constructs that deal with datatypes. A data declaration defines a
new datatype; values of this type are created by applying a constructor $K$,
their tags can be inspected using an is expression, and they can be decomposed
by a pattern-matching let expression. Further, we require each identifier
bound by a $\lambda$ or let expression to be unique¹.

The following example shows a desugared definition of ML's list type and the
associated length function; $\mu$ introduces a recursive type as described
below.

data $\forall\alpha.(\mu\beta.\ \text{Nil unit} + \text{Cons } \alpha \times \beta)$ in
let length = fix $\lambda$length.$\lambda$xs.
        if (is Nil xs)
            0
            (let Cons ab = xs in
                + (length (snd ab)) 1)
in
    length (Cons(3, Cons(7, Nil())))

3.3.2  Type  Syntax 

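Type variables     $\alpha$
Skolem functions   $\kappa$

Types              $\tau ::= \text{unit} \mid \text{bool} \mid \alpha \mid \tau_1 \times \tau_2 \mid \tau \to \tau' \mid \kappa(\tau_1, \ldots, \tau_n) \mid \chi$

Recursive types    $\chi ::= \mu\beta.\, K_1\eta_1 + \ldots + K_m\eta_m$ where $K_i \neq K_j$ for $i \neq j$

Existential types  $\eta ::= \exists\alpha.\eta \mid \tau$
Type schemes       $\sigma ::= \forall\alpha.\sigma \mid \tau$
Assumptions        $a ::= \sigma/x \mid \forall\alpha_1 \ldots \alpha_n.\chi/K$

¹ Of course, one would use a static, block-structured scoping discipline in practice.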

Our type syntax includes recursive types $\chi$ and Skolem type constructors
$\kappa$; the latter are used to type identifiers bound by a pattern-matching
let whose type is existentially quantified. Explicit existential types arise
only as domain types of value constructors. Assumption sets serve two
purposes: they map identifiers to type schemes and constructors to the
recursive type schemes they belong to. Thus, when we write $A(K)$, we mean the
$\sigma$ such that $\sigma = \forall\alpha_1 \ldots \alpha_n.\ \ldots + K\eta + \ldots$.
Further, let $\Sigma[K\eta]$ stand for sum type contexts such as
$K_1\eta_1 + \ldots + K_m\eta_m$, where $K_i = K$ and $\eta_i = \eta$ for some $i$.

3.4  Type  Inference 

3.4.1  Instantiation  and  Generalization  of  Type  Schemes 

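$\forall\alpha_1 \ldots \alpha_n.\tau \geq \tau'$ iff there are types $\tau_1, \ldots, \tau_n$ such that
    $\tau' = \tau[\tau_1/\alpha_1, \ldots, \tau_n/\alpha_n]$

$\exists\alpha_1 \ldots \alpha_n.\tau \leq \tau'$ iff there are types $\tau_1, \ldots, \tau_n$ such that
    $\tau' = \tau[\tau_1/\alpha_1, \ldots, \tau_n/\alpha_n]$

$gen(A, \tau) = \forall(FV(\tau) \setminus FV(A)).\tau$

$skolem(A, \exists\gamma_1 \ldots \gamma_n.\tau) = \tau[\kappa_i(\alpha_1, \ldots, \alpha_k)/\gamma_i]$ where $\kappa_1, \ldots, \kappa_n$ are new
    Skolem type constructors such that
    $\{\kappa_1, \ldots, \kappa_n\} \cap FS(A) = \emptyset$, and
    $\{\alpha_1, \ldots, \alpha_k\} = FV(\exists\gamma_1 \ldots \gamma_n.\tau) \setminus FV(A)$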

The first three auxiliary functions are standard. The function skolem
replaces each existentially quantified variable in a type by a unique type
constructor whose actual arguments are those free variables of the type that
are not free in the assumption set; this reflects the "maximal" knowledge we
have about the type represented by an existentially quantified type variable.
In addition to FV, the set of free type variables in a type scheme or
assumption set, we use FS, the set of Skolem type constructors that occur in a
type scheme or assumption set.
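To make the definitions of FV, FS, and skolem concrete, the following Python sketch implements them over a small term encoding of types. The encoding (tagged tuples), the helper names `var`, `con`, and `subst`, the generated names `k1`, `k2`, and the base type `int` are all assumptions of this illustration; the assumption set is represented only by its set of free type variables.

```python
import itertools

_counter = itertools.count(1)   # source of fresh Skolem names k1, k2, ...


def var(a):
    return ("var", a)


def con(name, *args):
    return ("con", name, list(args))


def fv(t):
    """Free type variables of a type term."""
    if t[0] == "var":
        return {t[1]}
    return set().union(*(fv(a) for a in t[2]))


def fs(t):
    """Skolem type constructors occurring in a type term."""
    if t[0] == "var":
        return set()
    inner = set().union(*(fs(a) for a in t[2]))
    return inner | {t[1]} if t[0] == "skolem" else inner


def subst(t, s):
    """Replace type variables in t according to the dict s."""
    if t[0] == "var":
        return s.get(t[1], t)
    return (t[0], t[1], [subst(a, s) for a in t[2]])


def skolem(fv_of_A, bound, body):
    """skolem(A, exists bound. body): each bound variable becomes a fresh
    Skolem constructor applied to the variables that are free in the
    existential type but not free in A."""
    params = sorted(fv(body) - set(bound) - set(fv_of_A))
    s = {g: ("skolem", "k%d" % next(_counter), [var(a) for a in params])
         for g in bound}
    return subst(body, s)


INT = con("int")

# exists a. a * (a -> int): no free variables, so the Skolem constant is nullary.
opened_key = skolem(set(), ["a"], con("*", var("a"), con("->", var("a"), INT)))

# exists b. b * (a -> b -> b) with a free and not in A: the constructor takes a.
opened_stack = skolem(set(), ["b"],
                      con("*", var("b"),
                          con("->", var("a"), con("->", var("b"), var("b")))))
```

In the second call the free variable `a` becomes an argument of the new constructor, which is what lets a later generalization step quantify over it.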

3.4.2  Inference  Rules  for  Expressions 

The  first  five  typing  rules  are  essentially  the  same  as  in  [CDDK86]. 

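\[ \text{(VAR)} \quad \frac{A(x) \geq \tau}{A \vdash x : \tau} \]

\[ \text{(PAIR)} \quad \frac{A \vdash e_1 : \tau_1 \qquad A \vdash e_2 : \tau_2}{A \vdash (e_1, e_2) : \tau_1 \times \tau_2} \]

\[ \text{(APPL)} \quad \frac{A \vdash e : \tau' \to \tau \qquad A \vdash e' : \tau'}{A \vdash e\ e' : \tau} \]

\[ \text{(ABS)} \quad \frac{A[\tau'/x] \vdash e : \tau}{A \vdash \lambda x.e : \tau' \to \tau} \]

\[ \text{(LET)} \quad \frac{A \vdash e : \tau \qquad A[gen(A, \tau)/x] \vdash e' : \tau'}{A \vdash \text{let } x = e \text{ in } e' : \tau'} \]

The new rules DATA, CONS, TEST, and PAT are used to type datatype
declarations, value constructors, is expressions, and pattern-matching let
expressions, respectively.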


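\[ \text{(DATA)} \quad \frac{\sigma = \forall\alpha_1 \ldots \alpha_n.\mu\beta.\, K_1\eta_1 + \ldots + K_m\eta_m \qquad FV(\sigma) = \emptyset \qquad A[\sigma/K_1, \ldots, \sigma/K_m] \vdash e : \tau}{A \vdash \text{data } \sigma \text{ in } e : \tau} \]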


The  DATA  rule  elaborates  a  declaration  of  a  recursive  datatype.  It  checks 
that  the  type  scheme  is  closed  and  types  the  expression  under  the  assumption 
set  extended  with  assumptions  about  the  constructors. 


\[ \text{(CONS)} \quad \frac{A(K) \geq \mu\beta.\Sigma[K\eta] \qquad \eta[\mu\beta.\Sigma[K\eta]/\beta] \leq \tau}{A \vdash K : \tau \to \mu\beta.\Sigma[K\eta]} \]


The  CONS  rule  observes  the  fact  that  existential  quantification  in  argument 
position  means  universal  quantification  over  the  whole  function  type;  this  is 
expressed  by  the  second  premise. 


\[ \text{(TEST)} \quad \frac{A(K) \geq \mu\beta.\Sigma[K\eta]}{A \vdash \text{is } K : (\mu\beta.\Sigma[K\eta]) \to \text{bool}} \]


The  TEST  rule  ensures  that  is  K  is  applied  only  to  arguments  whose  type 
is  the  same  as  the  result  type  of  constructor  K. 

\[ \text{(PAT)} \quad \frac{A \vdash e : \mu\beta.\Sigma[K\eta] \qquad FS(\tau') \subseteq FS(A) \qquad A[gen(A, skolem(A, \eta[\mu\beta.\Sigma[K\eta]/\beta]))/x] \vdash e' : \tau'}{A \vdash \text{let } K\ x = e \text{ in } e' : \tau'} \]

The last rule, PAT, governs the typing of pattern-matching let expressions.
It requires that the expression $e$ be of the same type as the result type of
the constructor $K$. The body $e'$ is typed under the assumption set extended
with an assumption about the bound identifier $x$. By definition of the
function skolem, the new Skolem type constructors do not appear in $A$; this
ensures that they do not appear in the type of any identifier free in $e'$
other than $x$. It is also guaranteed that the Skolem constructors do not
appear in the result type $\tau'$.
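As a worked instance of the PAT rule, consider the Key type used in the examples of Section 3.2. The sketch below assumes that Key is declared with a single constructor whose argument type is $\exists\alpha.\ \alpha \times (\alpha \to \mathit{int})$, that $\mathit{int}$ is treated as a base type alongside unit and bool, and that projections $\mathit{fst}$ and $\mathit{snd}$ are available in $A$; all three are assumptions made for illustration. Because the existential type has no free variables, the Skolem constructor $\kappa$ is nullary and gen quantifies nothing.

```latex
\begin{align*}
\eta &= \exists\alpha.\ \alpha \times (\alpha \to \mathit{int}) \\
\mathit{skolem}(A, \eta) &= \kappa \times (\kappa \to \mathit{int}),
  \qquad \kappa \notin FS(A) \\[4pt]
A[\kappa \times (\kappa \to \mathit{int})/x]
  &\vdash \mathit{snd}\ x\ (\mathit{fst}\ x) : \mathit{int},
  & FS(\mathit{int}) = \emptyset &\subseteq FS(A)
  && \text{(accepted)} \\
A[\kappa \times (\kappa \to \mathit{int})/x]
  &\vdash \mathit{fst}\ x : \kappa,
  & FS(\kappa) = \{\kappa\} &\not\subseteq FS(A)
  && \text{(rejected)}
\end{align*}
```

The first body uses the hidden value only through its own function, so the result type mentions no Skolem constructor; the second would let the hidden representation escape the scope of the let, and the side condition $FS(\tau') \subseteq FS(A)$ rules it out.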

3.4.3  Relation  to  the  ML  Type  Inference  System 

We compare our system with Mini-ML', an extension of Mini-ML with recursive
datatypes, but without existential quantification. Mini-ML' has the same
syntax as our language. The type inference system of Mini-ML' consists of the
rules VAR, PAIR, APPL, ABS, and LET, and the following modified versions of
the remaining rules¹:


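\[ \text{(DATA')} \quad \frac{\sigma = \forall\alpha_1 \ldots \alpha_n.\mu\beta.\, K_1\tau_1 + \ldots + K_m\tau_m \qquad FV(\sigma) = \emptyset \qquad A[\sigma/K_1, \ldots, \sigma/K_m] \vdash e : \tau}{A \vdash \text{data } \sigma \text{ in } e : \tau} \]

\[ \text{(CONS')} \quad \frac{A(K) \geq \mu\beta.\Sigma[K\tau]}{A \vdash K : \tau \to \mu\beta.\Sigma[K\tau]} \]

\[ \text{(TEST')} \quad \frac{A(K) \geq \mu\beta.\Sigma[K\tau]}{A \vdash \text{is } K : (\mu\beta.\Sigma[K\tau]) \to \text{bool}} \]

\[ \text{(PAT')} \quad \frac{A \vdash e : \mu\beta.\Sigma[K\tau] \qquad A[gen(A, \tau[\mu\beta.\Sigma[K\tau]/\beta])/x] \vdash e' : \tau'}{A \vdash \text{let } K\ x = e \text{ in } e' : \tau'} \]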


¹ Theoretically, it is sufficient to modify only the DATA rule to preclude
that existential quantifiers arise in the inference system; however, it is
more illustrative to present modified versions of the CONS, TEST, and PAT
rules as well.

Theorem 3.1 [Conservative extension] For any Mini-ML' expression $e$,
$A \vdash e : \tau$ iff $A \vdash_{\text{Mini-ML'}} e : \tau$.

Proof: By structural induction on $e$.

Corollary 3.2 [Conservative extension] Our type system is a conservative
extension of the Mini-ML type system described in [CDDK86], in the following
sense: For any Mini-ML expression $e$, $A \vdash e : \tau$ iff
$A \vdash_{\text{Mini-ML}} e : \tau$.

Proof: Follows immediately from Theorem 3.1.

3.5  Type  Reconstruction 

The  type  reconstruction  algorithm  is  a  straightforward  translation  from  the 
deterministic  typing  rules,  using  a  standard  unification  algorithm  [Rob65] 
[MM82].  We  conjecture  that  its  complexity  is  the  same  as  that  of  algorithm 
W. 
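The unification step can be sketched as follows. This is a generic first-order unifier with an occurs check, written for illustration and not taken from [Rob65] or [MM82]; the encoding (strings for type variables, tuples headed by a constructor name for everything else) and the function names are assumptions of the sketch. Skolem type constructors would be encoded like any other constructor, so they unify only with themselves.

```python
def apply(s, t):
    """Apply substitution s (dict: variable name -> type) to type t."""
    if isinstance(t, str):                        # type variable
        return apply(s, s[t]) if t in s else t
    return (t[0],) + tuple(apply(s, a) for a in t[1:])


def occurs(v, t):
    """Does variable v occur in the (already substituted) type t?"""
    if isinstance(t, str):
        return v == t
    return any(occurs(v, a) for a in t[1:])


def mgu(t1, t2, s=None):
    """Return a substitution extending s that makes t1 and t2 equal,
    or raise TypeError if no unifier exists."""
    s = dict(s or {})
    t1, t2 = apply(s, t1), apply(s, t2)
    if isinstance(t1, str) or isinstance(t2, str):
        if t1 == t2:
            return s
        v, t = (t1, t2) if isinstance(t1, str) else (t2, t1)
        if occurs(v, t):
            raise TypeError("occurs check failed for %s" % v)
        s[v] = t
        return s
    if t1[0] != t2[0] or len(t1) != len(t2):
        raise TypeError("cannot unify %s with %s" % (t1[0], t2[0]))
    for a, b in zip(t1[1:], t2[1:]):
        s = mgu(a, b, s)
    return s


BOOL, UNIT = ("bool",), ("unit",)
u = mgu(("->", "a", BOOL), ("->", UNIT, "b"))
```

A Skolem constant such as `("k1",)` clashes with `("bool",)` in the constructor comparison, which is how an abstract type stays distinct from every concrete type.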

3.5.1  Auxiliary  Functions 

In our algorithm, we need to instantiate universally quantified types and
generalize existentially quantified types. Both are handled in the same way.

$inst_\forall(\forall\alpha_1 \ldots \alpha_n.\tau) = \tau[\beta_1/\alpha_1, \ldots, \beta_n/\alpha_n]$ where $\beta_1, \ldots, \beta_n$ are fresh
type variables

$inst_\exists(\exists\alpha_1 \ldots \alpha_n.\tau) = \tau[\beta_1/\alpha_1, \ldots, \beta_n/\alpha_n]$ where $\beta_1, \ldots, \beta_n$ are fresh
type variables

The functions skolem and gen are the same as in the inference rules, with
the additional detail that skolem always creates fresh Skolem type
constructors.

3.5.2  Algorithm

Our type reconstruction function takes an assumption set and an expression,
and it returns a substitution and a type expression. There is one case for
each typing rule.

$TC(A, x) =$
    $(Id, inst_\forall(A(x)))$

$TC(A, (e_1, e_2)) =$
    let $(S_1, \tau_1) = TC(A, e_1)$
        $(S_2, \tau_2) = TC(S_1A, e_2)$
    in $(S_2S_1, S_2\tau_1 \times \tau_2)$

$TC(A, e\ e') =$
    let $(S, \tau) = TC(A, e)$
        $(S', \tau') = TC(SA, e')$
        $\beta$ be a fresh type variable
        $U = mgu(S'\tau, \tau' \to \beta)$
    in $(US'S, U\beta)$

$TC(A, \lambda x.e) =$
    let $\beta$ be a fresh type variable
        $(S, \tau) = TC(A[\beta/x], e)$
    in $(S, S\beta \to \tau)$

$TC(A, \text{let } x = e \text{ in } e') =$
    let $(S, \tau) = TC(A, e)$
        $(S', \tau') = TC(SA[gen(SA, \tau)/x], e')$
    in $(S'S, \tau')$

$TC(A, \text{data } \sigma \text{ in } e) =$
    let $\forall\alpha_1 \ldots \alpha_n.\mu\beta.\, K_1\eta_1 + \ldots + K_m\eta_m = \sigma$ in
    if $FV(\sigma) = \emptyset$ then
        $TC(A[\sigma/K_1, \ldots, \sigma/K_m], e)$

$TC(A, K) =$
    let $\tau = inst_\forall(A(K))$
        $\mu\beta.\ \ldots + K\eta + \ldots = \tau$
    in $(Id, (inst_\exists(\eta[\tau/\beta])) \to \tau)$

$TC(A, \text{is } K) =$
    let $\tau = inst_\forall(A(K))$
    in $(Id, \tau \to \text{bool})$

$TC(A, \text{let } K\ x = e \text{ in } e') =$
    let $(S, \tau) = TC(A, e)$
        $U = mgu(\tau, inst_\forall(A(K)))$
        $\mu\beta.\ \ldots + K\eta + \ldots = U\tau$
        $\tau_\kappa = skolem(USA, \eta[U\tau/\beta])$
        $(S', \tau') = TC(USA[gen(USA, \tau_\kappa)/x], e')$
    in
    if $FS(\tau') \subseteq FS(S'USA) \wedge$
       $(FS(\tau_\kappa) \setminus FS(\eta[U\tau/\beta])) \cap FS(S'USA) = \emptyset$
    then $(S'US, \tau')$
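The first cases of TC follow the familiar shape of algorithm W, which the Python sketch below illustrates for variables, applications, abstractions, and let expressions only. It is not the algorithm above: it threads one accumulated substitution through the recursion instead of composing $S'S$ explicitly, and it omits pairs, datatypes, and Skolem constructors. The expression and type encodings and all function names are assumptions of this illustration.

```python
import itertools

_ids = itertools.count()


def fresh():
    return "t%d" % next(_ids)


def apply(s, t):
    """Apply substitution s to type t; type variables are strings."""
    if isinstance(t, str):
        return apply(s, s[t]) if t in s else t
    return (t[0],) + tuple(apply(s, a) for a in t[1:])


def fv(t):
    return {t} if isinstance(t, str) else set().union(*(fv(a) for a in t[1:]))


def mgu(t1, t2, s):
    """Extend s so that t1 and t2 become equal, or raise TypeError."""
    t1, t2 = apply(s, t1), apply(s, t2)
    if t1 == t2:
        return s
    if isinstance(t2, str):
        t1, t2 = t2, t1
    if isinstance(t1, str):
        if t1 in fv(t2):
            raise TypeError("occurs check failed")
        return {**s, t1: t2}
    if t1[0] != t2[0] or len(t1) != len(t2):
        raise TypeError("constructor clash")
    for a, b in zip(t1[1:], t2[1:]):
        s = mgu(a, b, s)
    return s


def inst(scheme):
    """Instantiate the quantified variables of a scheme with fresh ones."""
    qs, t = scheme
    return apply({q: fresh() for q in qs}, t)


def gen(env, t, s):
    """Quantify the variables of t that are not free in the environment."""
    t = apply(s, t)
    env_fv = set().union(*(fv(apply(s, ty)) - set(qs)
                           for qs, ty in env.values()))
    return (sorted(fv(t) - env_fv), t)


def tc(env, e, s):
    if e[0] == "var":
        return s, inst(env[e[1]])
    if e[0] == "app":
        s, t = tc(env, e[1], s)
        s, t_arg = tc(env, e[2], s)
        b = fresh()
        s = mgu(t, ("->", t_arg, b), s)
        return s, apply(s, b)
    if e[0] == "lam":
        b = fresh()
        s, t = tc({**env, e[1]: ([], b)}, e[2], s)
        return s, ("->", apply(s, b), t)
    if e[0] == "let":
        s, t = tc(env, e[2], s)
        return tc({**env, e[1]: gen(env, t, s)}, e[3], s)
    raise ValueError("unknown expression: %r" % (e,))


def infer(env, e):
    s, t = tc(env, e, {})
    return apply(s, t)


identity = ("lam", "x", ("var", "x"))
id_type = infer({}, identity)
poly_type = infer({}, ("let", "id", identity,
                       ("app", ("var", "id"), ("var", "id"))))
```

The let case generalizes the type of `id` before checking the body, which is why `id id` is accepted even though the self-application `λx. x x` fails the occurs check.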

3.5.3  Syntactic Soundness and Completeness of Type Reconstruction

Since any two type schemes that differ only by renaming of bound variables
instantiate to the same set of types, it is convenient to treat them as
equivalent. This is expressed by the following lemma:

Lemma 3.3 [Equivalence under renaming] Let $\sigma_1 = \forall\alpha_1 \ldots \alpha_n.\tau_0$ and
$\sigma_2 = \forall\beta_1 \ldots \beta_n.(\tau_0[\beta_1/\alpha_1, \ldots, \beta_n/\alpha_n])$. Then $\sigma_1 \geq \tau$ iff $\sigma_2 \geq \tau$ for any
type $\tau$.

Proof:  Follows  immediately  from  the  definition  of  instantiation. 

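Lemma 3.4 [Stability of $\geq$] If $\sigma \geq \tau$, then $S\sigma \geq S\tau$.

Proof: By definition, $\sigma = \forall\alpha_1 \ldots \alpha_n.\tau_0$ and
$\tau = \tau_0[\tau_1/\alpha_1, \ldots, \tau_n/\alpha_n]$ for some types $\tau_1, \ldots, \tau_n$. Since the
$\alpha_i$'s may be renamed consistently, we assume that
$\{\alpha_1, \ldots, \alpha_n\} \cap (Dom\,S \cup FV(Rng\,S)) = \emptyset$. Consequently,
$S\sigma = \forall\alpha_1 \ldots \alpha_n.S\tau_0 \geq (S\tau_0)[S\tau_1/\alpha_1, \ldots, S\tau_n/\alpha_n] = S\tau$, by definition
of $\geq$.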

Lemma 3.5 [Stability of gen] $S\,gen(A, \tau) = gen(SA, S\tau)$.

Proof: Follows from the definition of gen, assuming that
$(FV(\tau) \setminus FV(A)) \cap (Dom\,S \cup FV(Rng\,S)) = \emptyset$.

Lemma 3.6 [Stability of skolem] $S\,skolem(A, \eta) = skolem(SA, S\eta)$.

Proof: Similar, using the definition of skolem and an appropriate renaming.

Lemma 3.7 [Stability of $\vdash$] If $A \vdash e : \tau$ and $S$ is a substitution, then
$SA \vdash e : S\tau$ also holds. Moreover, if there is a proof tree for
$A \vdash e : \tau$ of height $n$, then there is also a proof tree for
$SA \vdash e : S\tau$ of height less than or equal to $n$.

Proof: By induction on the height $n$ of the proof tree for $A \vdash e : \tau$. We
have one case for each type inference rule, but include only the nonstandard
cases.

$A \vdash \text{data } \forall\alpha_1 \ldots \alpha_n.\mu\beta.\, K_1\eta_1 + \ldots + K_m\eta_m \text{ in } e : \tau$

The premise is $A[\sigma/K_1, \ldots, \sigma/K_m] \vdash e : \tau$, where
$\sigma = \forall\alpha_1 \ldots \alpha_n.\mu\beta.\, K_1\eta_1 + \ldots + K_m\eta_m$. Since $FV(\sigma) = \emptyset$, we have
$S\sigma = \sigma$. By the inductive assumption,
$S(A[\sigma/K_1, \ldots, \sigma/K_m]) \vdash e : S\tau$, where
$S(A[\sigma/K_1, \ldots, \sigma/K_m]) = (SA)[\sigma/K_1, \ldots, \sigma/K_m]$. Finally, we apply
the DATA rule and obtain $SA \vdash \text{data } \sigma \text{ in } e : S\tau$.

$A \vdash K : \tau \to \mu\beta.\Sigma[K\eta]$ and
$A \vdash \text{is } K : (\mu\beta.\Sigma[K\eta]) \to \text{bool}$

The claim follows from stability of $\geq$ under substitution.

$A \vdash \text{let } K\ x = e \text{ in } e' : \tau'$

Assuming that $\beta \notin Dom\,S \cup FV(Rng\,S)$, we have
$S(\mu\beta.\Sigma[K\eta]) = \mu\beta.\Sigma[K(S\eta)]$ and
$S(\eta[\mu\beta.\Sigma[K\eta]/\beta]) = (S\eta)[\mu\beta.\Sigma[K(S\eta)]/\beta]$.

We apply the inductive assumption to the first premise, obtaining
$SA \vdash e : \mu\beta.\Sigma[K(S\eta)]$, and to the last premise, obtaining
$S(A[gen(A, skolem(A, \eta[\mu\beta.\Sigma[K\eta]/\beta]))/x]) \vdash e' : S\tau'$. Further,

$S(A[gen(A, skolem(A, \eta[\mu\beta.\Sigma[K\eta]/\beta]))/x])$
    $= (SA)[S\,gen(A, skolem(A, \eta[\mu\beta.\Sigma[K\eta]/\beta]))/x]$
    $= (SA)[gen(SA, skolem(SA, (S\eta)[\mu\beta.\Sigma[K(S\eta)]/\beta]))/x]$,

using Lemma 3.5 and Lemma 3.6.

Finally, we observe that $FS(\tau') \subseteq FS(A)$ implies $FS(S\tau') \subseteq FS(SA)$, and
the claim follows by applying the PAT rule.

Theorem 3.8 [Syntactic soundness] If $TC(A, e) = (S, \tau)$, then $SA \vdash e : \tau$.

Proof: A straightforward application of Lemma 3.7. We show the only
tricky case:

$TC(A, \text{let } K\ x = e \text{ in } e') = (S'US, \tau')$

By applying the inductive assumption to the first recursive call to TC,
we have $SA \vdash e : \tau$. Since $mgu(\tau, inst_\forall(A(K)))$ succeeds with $U$, we
know that $A(K) \geq U\tau$, whence $U\tau$ is of the form $\mu\beta.\Sigma[K\eta]$. By
Lemma 3.7, $S'USA \vdash e : S'U\tau$. We now apply the inductive assumption
to the second recursive call and get
$S'(USA[gen(USA, skolem(USA, \eta[U\tau/\beta]))/x]) \vdash e' : \tau'$.

We use Lemma 3.5 and Lemma 3.6 to obtain
$(S'USA)[gen(S'USA, skolem(S'USA, S'(\eta[U\tau/\beta])))/x] \vdash e' : \tau'$,
where $S'(\eta[(U\tau)/\beta]) = (S'\eta)[S'U\tau/\beta]$. The subsequent if statement
ensures that none of the fresh Skolem constructors escape the
scope of the let expression. Hence, the PAT rule applies and our
claim is proved.


Definition 3.1 [Principal type] $\tau$ is a principal type of expression $e$ under
assumption set $A$ if $A \vdash e : \tau$ and whenever $A \vdash e : \tau'$ then there is a
substitution $S$ such that $S\tau = \tau'$.
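A small worked instance of this definition, for the identity function and assuming that $\alpha$ is not free in $A$: every type derivable for $\lambda x.x$ is obtained from $\alpha \to \alpha$ by some substitution $S$, so $\alpha \to \alpha$ is principal.

```latex
\begin{align*}
A &\vdash \lambda x.x : \alpha \to \alpha
  && \text{(principal type)} \\
A &\vdash \lambda x.x : \mathit{bool} \to \mathit{bool}
  && S = [\mathit{bool}/\alpha] \\
A &\vdash \lambda x.x : (\beta \times \beta) \to (\beta \times \beta)
  && S = [\beta \times \beta/\alpha]
\end{align*}
```

By contrast, $\mathit{bool} \to \mathit{bool}$ is derivable but not principal, since no substitution maps it to $\alpha \to \alpha$.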

Theorem 3.9 [Syntactic completeness] If $\hat{S}A \vdash e : \hat\tau$, then $TC(A, e)$
succeeds with $TC(A, e) = (S, \tau)$ and there is a substitution $R$ such that
$\hat{S}A = RSA$ and $\hat\tau = R\tau$.

Proof: Analogous to the completeness proof given in [Dam85]. We show
only the new cases:

$\hat{S}A \vdash \text{data } \forall\alpha_1 \ldots \alpha_n.\mu\beta.\, K_1\eta_1 + \ldots + K_m\eta_m \text{ in } e : \hat\tau$

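Let $\sigma = \forall\alpha_1 \ldots \alpha_n.\mu\beta.\, K_1\eta_1 + \ldots$. Since $FV(\sigma) = \emptyset$, we have
$\hat{S}\sigma = \sigma$. The other premise is $\hat{S}A' \vdash e : \hat\tau$, where
$A' = A[\sigma/K_1, \ldots, \sigma/K_m]$.

By the inductive assumption, $TC(A', e) = (S, \tau)$, where there is a
substitution $R$ such that $\hat{S}A' = RSA'$ and $\hat\tau = R\tau$. Hence
$TC(A, \text{data} \ldots)$ succeeds with $(S, \tau)$. Since $\hat{S}A' = RSA'$,
$\hat{S}A = RSA$ also holds, whence $R$ is our desired substitution.

$\hat{S}A \vdash K : \hat\tau \to \mu\beta.\Sigma[K\hat\eta]$

It is clear from the premises that $\hat{S}A(K)$ is of the form
$\sigma = \forall\alpha_1 \ldots \alpha_n.\chi$, where $\chi = \mu\beta.\Sigma[K\,\exists\beta_1 \ldots \beta_k.\tau_0]$; further,
$FV(\sigma) = \emptyset$, since $\sigma$ must have been declared in a surrounding data
expression. Therefore, $\hat{S}A(K) = A(K)$, and the instantiations $inst_\forall$ and
$inst_\exists$ succeed, such that $\tau = (\tau_0 \to \chi)[\alpha'_i/\alpha_i, \beta'_j/\beta_j]$, where the
$\alpha'_i$ and $\beta'_j$ are fresh type variables. By definition of $\geq$ and $\leq$, there
are types $\tau_1, \ldots, \tau_n$ and $\tau'_1, \ldots, \tau'_k$ such that
$\hat\tau \to \mu\beta.\Sigma[K\hat\eta] = (\tau_0 \to \chi)[\tau_i/\alpha_i, \tau'_j/\beta_j]$.

Finally, by choosing $R = \hat{S} + [\tau_i/\alpha'_i, \tau'_j/\beta'_j]$, we have
$R\tau = \hat\tau \to \mu\beta.\Sigma[K\hat\eta]$ and $R\,Id\,A = \hat{S}A$, since
$FV(\tau) \subseteq \{\alpha'_1, \ldots, \alpha'_n, \beta'_1, \ldots, \beta'_k\}$ and
$FV(A) \cap \{\alpha'_1, \ldots, \alpha'_n, \beta'_1, \ldots, \beta'_k\} = \emptyset$.

$\hat{S}A \vdash \text{is } K : (\mu\beta.\Sigma[K\hat\eta]) \to \text{bool}$

This case is analogous to the preceding one.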

$\hat{S}A \vdash \text{let } K\ x = e \text{ in } e' : \hat\tau'$

Again, $\hat{S}A(K) = A(K)$ must be of the form $\sigma = \forall\alpha_1 \ldots \alpha_n.\chi$, where
$\chi = \mu\beta.\Sigma[K\eta_0]$. Therefore, $inst_\forall(\sigma) = \chi[\alpha'_i/\alpha_i]$, where the $\alpha'_i$ are
new variables. We apply the inductive assumption to the first premise,
$\hat{S}A \vdash e : \mu\beta.\Sigma[K\hat\eta]$, whence $TC(A, e) = (S, \tau)$ and there is an $R$ such
that $\hat{S}A = RSA$ and $R\tau = \mu\beta.\Sigma[K\hat\eta] = \mu\beta.\Sigma[K(\eta_0[\tau_i/\alpha_i])]$.

In the first case, $\tau = \alpha$, thus $R = [\mu\beta.\Sigma[K(\eta_0[\tau_i/\alpha_i])]/\alpha]$. Further,
mgu succeeds with $U = [\chi[\alpha'_i/\alpha_i]/\alpha]$ and $R = [\tau_i/\alpha'_i]U$, therefore
$RU = R$ and $\eta = \eta_0[\alpha'_i/\alpha_i]$. Consequently,
$\hat{S}A = RSA = RUSA$ and $\hat\eta[\mu\beta.\Sigma[K\hat\eta]/\beta] = R(\eta[U\tau/\beta])$.

In the second case, $\tau = \mu\beta.\Sigma[K(\eta_0[\tilde\tau_i/\alpha_i])]$, where $R\tilde\tau_i = \tau_i$. Therefore,
mgu succeeds with $U = [\tilde\tau_i/\alpha'_i]$. Since the $\alpha'_i$ occur only in $inst_\forall(\sigma)$,
we have $U\tau = \tau$ and $\eta = \eta_0[\tilde\tau_i/\alpha_i]$. Further, $\hat{S}A = RSA = RUSA$
and $\hat\eta[\mu\beta.\Sigma[K\hat\eta]/\beta] = R(\eta[U\tau/\beta])$.

In either case, we can apply Lemma 3.7 to the last premise, obtaining
$((RUSA)[gen(RUSA, skolem(RUSA, R(\eta[U\tau/\beta])))/x]) \vdash e' : \hat\tau'$.
By applying Lemma 3.5 and Lemma 3.6, we get
$R((USA)[gen(USA, skolem(USA, \eta[U\tau/\beta]))/x]) \vdash e' : \hat\tau'$.

Next, the inductive hypothesis gives us $TC(A', e') = (S', \tau')$, where
$A' = USA[gen(USA, skolem(USA, \eta[U\tau/\beta]))/x]$, and an $R'$ such that
$RA' = R'S'A'$ and $\hat\tau' = R'\tau'$. Then $R'S'USA = RUSA = RSA = \hat{S}A$
also holds.

Finally, $FS(\hat\tau') = FS(R'\tau') \subseteq FS(\hat{S}A) = FS(R'S'USA)$ implies
$FS(\tau') \subseteq FS(S'USA)$. This, together with the definition of skolem,
guarantees that the if statement succeeds. Consequently, $TC(A, \text{let} \ldots)$
succeeds with $(S'US, \tau')$, and $R'$ is the desired substitution.


Corollary 3.10 [Principal type] If $TC(A, e) = (S, \tau)$, then $\tau$ is a principal
type for $e$ under $A$.

3.6  Semantics 

We give a standard denotational semantics. The evaluation function $E$ maps
an expression $e \in Exp$ to some semantic value $v$, in the context of an
evaluation environment $\rho \in Env$. An evaluation environment is a partial
mapping from identifiers to semantic values. Runtime type errors are
represented by the special value wrong. Tagged values are used to capture the
semantics of algebraic data types.

We distinguish between three error situations: runtime type errors
(wrong), nontermination, and a mismatch when an attempt is made to decompose
a tagged value whose tag does not match the tag of the destructor.
Both nontermination and mismatch are expressed by $\bot$.

Our type inference system is sound with respect to the evaluation function;
a well-typed program never evaluates to wrong. The formal proof for
semantic soundness is given below.

It should be noted that we do not commit ourselves to a strict or non-strict
evaluation function. Therefore, our treatment of existential types applies to
languages with both strict and non-strict semantics. In either case,
appropriate conditions would have to be added to the definition of the
evaluation function for pair expressions, function applications, let
expressions, and pattern-matching let expressions: the strict evaluation
function returns $\bot$ whenever a subexpression evaluates to $\bot$, while the
non-strict evaluation function retains $\bot$ as the value of that
subexpression.

3.6.1  Semantic Domain

Unit value        $U = \{unit\}_\bot$

Boolean values    $B = \{false, true\}_\bot$

Constructor tags  $C$

Semantic domain   $V = U + B + (V \to V) + (V \times V) + (C \times V) + \{wrong\}_\bot$

In the latter definition of $V$, $+$ stands for the coalesced sum, so that all
types over $V$ share the same $\bot$.

3.6.2  Semantics  of  Expressions 

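The semantic function for expressions,

$E : Exp \to Env \to V$,

is defined as follows:

$E[\![x]\!]\rho = \rho(x)$

$E[\![(e_1, e_2)]\!]\rho = \langle E[\![e_1]\!]\rho, E[\![e_2]\!]\rho \rangle$

$E[\![e\ e']\!]\rho =$
    if $E[\![e]\!]\rho \in V \to V$ then
        $(E[\![e]\!]\rho)(E[\![e']\!]\rho)$
    else wrong

$E[\![\lambda x.e]\!]\rho = \lambda v \in V.\ E[\![e]\!](\rho[v/x])$

$E[\![\text{let } x = e \text{ in } e']\!]\rho = E[\![e']\!](\rho[E[\![e]\!]\rho/x])$

$E[\![\text{data } \sigma \text{ in } e]\!]\rho = E[\![e]\!]\rho$

$E[\![K]\!]\rho = \lambda v \in V.\ \langle K, v \rangle$

$E[\![\text{is } K]\!]\rho = \lambda v \in V.$ if $v \in \{K\} \times V$ then true else false

$E[\![\text{let } K\ x = e \text{ in } e']\!]\rho =$
    $E[\![e']\!](\rho[$if $E[\![e]\!]\rho \in \{K\} \times V$ then
                    $snd(E[\![e]\!]\rho)$
                else $\bot\ /x])$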
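The evaluation function can be approximated by a small Python interpreter. This is only a sketch under several assumptions: expressions are encoded as tagged tuples, the sentinel `BOTTOM` stands in for the $\bot$ produced by a tag mismatch (nontermination is not modeled), and the environment entries `fst`, `snd`, `three`, and `succ` are primitives invented for the example.

```python
from dataclasses import dataclass
from typing import Any

WRONG = "wrong"   # runtime type error
BOTTOM = None     # stands in for the bottom element produced by a tag mismatch


@dataclass(frozen=True)
class Tagged:
    """An element of C x V: a constructor tag paired with a value."""
    tag: str
    value: Any


def E(e, rho):
    kind = e[0]
    if kind == "var":
        return rho[e[1]]
    if kind == "pair":
        return (E(e[1], rho), E(e[2], rho))
    if kind == "app":
        f = E(e[1], rho)
        if f is BOTTOM:
            return BOTTOM
        return f(E(e[2], rho)) if callable(f) else WRONG
    if kind == "lam":
        return lambda v: E(e[2], {**rho, e[1]: v})
    if kind == "let":
        return E(e[3], {**rho, e[1]: E(e[2], rho)})
    if kind == "data":
        return E(e[1], rho)
    if kind == "con":
        return lambda v: Tagged(e[1], v)
    if kind == "is":
        return lambda v: isinstance(v, Tagged) and v.tag == e[1]
    if kind == "letK":
        v = E(e[3], rho)
        matched = isinstance(v, Tagged) and v.tag == e[1]
        return E(e[4], {**rho, e[2]: v.value if matched else BOTTOM})
    raise ValueError("unknown expression: %r" % (e,))


def proj(i):
    def go(p):
        if p is BOTTOM:
            return BOTTOM
        return p[i] if isinstance(p, tuple) else WRONG
    return go


def v(x):
    return ("var", x)


def app(f, a):
    return ("app", f, a)


rho0 = {"fst": proj(0), "snd": proj(1), "three": 3, "succ": lambda n: n + 1}

# let key x = key (three, succ) in (snd x) (fst x)
open_key = ("letK", "key", "x",
            app(("con", "key"), ("pair", v("three"), v("succ"))),
            app(app(v("snd"), v("x")), app(v("fst"), v("x"))))

# (three, three) three : a pair applied as a function
bad_apply = app(("pair", v("three"), v("three")), v("three"))

# let cons x = nil three in x : the tags do not match
mismatch = ("letK", "cons", "x", app(("con", "nil"), v("three")), v("x"))
```

Evaluating `open_key` in `rho0` applies the packaged function to the packaged value, `bad_apply` yields `WRONG`, and `mismatch` yields the `BOTTOM` sentinel; only the second of these is an outcome that the type system is meant to exclude.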

3.6.3  Semantics  of  Types 

Following [MPS86], we identify types with weak ideals over the semantic
domain $V$. A type environment $\psi \in TEnv$ is a partial mapping from type
variables to ideals and from Skolem type constructors to functions between
ideals. The semantic interpretation of types,

$T : TExp \to TEnv \to \Im(V)$,

is defined as follows.


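$T[\![\text{unit}]\!]\psi = U$

$T[\![\text{bool}]\!]\psi = B$

$T[\![\alpha]\!]\psi = \psi(\alpha)$

$T[\![\tau_1 \times \tau_2]\!]\psi = T[\![\tau_1]\!]\psi \times T[\![\tau_2]\!]\psi$

$T[\![\tau \to \tau']\!]\psi = T[\![\tau]\!]\psi \to T[\![\tau']\!]\psi$

$T[\![\kappa(\tau_1, \ldots, \tau_n)]\!]\psi = (\psi(\kappa))(T[\![\tau_1]\!]\psi, \ldots, T[\![\tau_n]\!]\psi)$

$T[\![\mu\beta.\sum K_i\eta_i]\!]\psi = \mu(\lambda I \in \Im(V).\ \sum \{K_i\} \times T[\![\eta_i]\!](\psi[I/\beta]))$

$T[\![\forall\alpha.\sigma]\!]\psi = \bigcap_{I \in \Re} T[\![\sigma]\!](\psi[I/\alpha])$

$T[\![\exists\alpha.\eta]\!]\psi = \bigcup_{I \in \Re} T[\![\eta]\!](\psi[I/\alpha])$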

The universal and existential quantifications range over the set
$\Re \subseteq \Im(V)$ of all ideals that do not contain wrong. Note that the sum in the
definition of recursive types is actually a union, since the constructor tags
are assumed to be distinct. It should also be noted that our interpretation
does not handle ML's nonregular, mutually recursive datatypes. An adequate
model can be given by extending the semantics described in [MPS86] to handle
full ML datatypes [Aba92]; the machinery for this model is given in [Plo83].
An adequate semantics can also be found in the PER model described in [BM92].

Theorem  3.11  The  semantic  function  for  types  is  well-defined. 

Proof: As in [MPS86]. We observe that
$\lambda I \in \Im(V).\ \sum \{K_i\} \times T[\![\eta_i]\!](\psi[I/\beta])$ is always contractive, since
cartesian product and sum of ideals are contractive; therefore, the fixed
point of such a function exists.

Lemma 3.12 Let $\psi$ be a type environment such that for every $\alpha \in Dom\,\psi$,
$wrong \notin \psi(\alpha)$. Then for every type scheme $\sigma$, $wrong \notin T[\![\sigma]\!]\psi$.

Proof: By structural induction on $\sigma$.

Lemma 3.13 [Substitution]
$T[\![\sigma[\sigma'/\alpha]]\!]\psi = T[\![\sigma]\!](\psi[T[\![\sigma']\!]\psi/\alpha])$.

Proof: Again, by structural induction on $\sigma$.

Definition 3.2 [Semantic type judgment] Let $A$ be an assumption set, $e$ an
expression, and $\sigma$ a type scheme. We define $\models_{\rho,\psi} A$ as meaning that
$Dom\,A \subseteq Dom\,\rho$ and for every $x \in Dom\,A$, $\rho(x) \in T[\![A(x)]\!]\psi$; further, we
say $A \models_{\rho,\psi} e : \sigma$ iff $\models_{\rho,\psi} A$ implies $E[\![e]\!]\rho \in T[\![\sigma]\!]\psi$; and finally,
$A \models e : \sigma$ means that for all $\rho \in Env$ and $\psi \in TEnv$ we have
$A \models_{\rho,\psi} e : \sigma$.

Theorem 3.14 [Semantic soundness] If $A \vdash e : \tau$ then $A \models e : \tau$.

Proof: By induction on the size of the proof tree for $A \vdash e : \tau$. We need to
consider each of the cases given by the type inference rules. Applying
the inductive assumption and the typing judgments from the preceding
steps in the type derivation, we use the semantics of the types of the
partial results of the evaluation. In each of the cases below, choose $\psi$
and $\rho$ arbitrarily, such that $\models_{\rho,\psi} A$. We include only the nonstandard
cases. Lemma 3.13 will be used frequently.

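$A \vdash \text{data } \forall\alpha_1 \ldots \alpha_n.\mu\beta.\, K_1\eta_1 + \ldots + K_m\eta_m \text{ in } e : \tau$

The premise in the type derivation is $A[\sigma/K_1, \ldots, \sigma/K_m] \vdash e : \tau$,
where $\sigma = \forall\alpha_1 \ldots \alpha_n.\mu\beta.\, K_1\eta_1 + \ldots + K_m\eta_m$. Since by definition,
$\models_{\rho,\psi} A[\sigma/K_1, \ldots, \sigma/K_m]$, we can use the inductive assumption to
obtain $E[\![\text{data } \forall\alpha_1 \ldots \alpha_n.\chi \text{ in } e]\!]\rho = E[\![e]\!]\rho \in T[\![\tau]\!]\psi$.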

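$A \vdash K : \tau \to \mu\beta.\Sigma[K\eta]$

The last premise in the type derivation is $\eta[\mu\beta.\Sigma[K\eta]/\beta] \leq \tau$, where
$\eta = \exists\gamma_1 \ldots \gamma_n.\hat\tau$. By definition of instantiation of existential types,
$\tau = \hat\tau[\tau_j/\gamma_j, \mu\beta.\Sigma[K\eta]/\beta]$ for some types $\tau_1, \ldots, \tau_n$.

First, choose an arbitrary $v \in T[\![\tau]\!]\psi$ and a finite $a \leq v$. Now,

\begin{align*}
a &\in (T[\![\hat\tau[\tau_j/\gamma_j, \mu\beta.\Sigma[K\eta]/\beta]]\!]\psi)^\circ \\
  &= (T[\![\hat\tau[\mu\beta.\Sigma[K\eta]/\beta]]\!](\psi[T[\![\tau_j]\!]\psi/\gamma_j]))^\circ \\
  &\subseteq \bigcup_{J_1, \ldots, J_n \in \Re} (T[\![\hat\tau[\mu\beta.\Sigma[K\eta]/\beta]]\!](\psi[J_j/\gamma_j]))^\circ \\
  &= \Bigl(\bigcup_{J_1, \ldots, J_n \in \Re} T[\![\hat\tau[\mu\beta.\Sigma[K\eta]/\beta]]\!](\psi[J_j/\gamma_j])\Bigr)^\circ \\
  &= (T[\![\eta[\mu\beta.\Sigma[K\eta]/\beta]]\!]\psi)^\circ.
\end{align*}

Hence, $v = \bigsqcup \{a \mid a \text{ finite and } a \leq v\} \in T[\![\eta[\mu\beta.\Sigma[K\eta]/\beta]]\!]\psi$, by
closure of ideals under limits. Consequently,

\begin{align*}
\langle K, v \rangle &\in \{K\} \times T[\![\eta[\mu\beta.\Sigma[K\eta]/\beta]]\!]\psi \\
  &\subseteq \ldots + \{K\} \times T[\![\eta[\mu\beta.\Sigma[K\eta]/\beta]]\!]\psi + \ldots \\
  &= \ldots + \{K\} \times T[\![\eta]\!](\psi[T[\![\mu\beta.\Sigma[K\eta]]\!]\psi/\beta]) + \ldots \\
  &= T[\![\mu\beta.\Sigma[K\eta]]\!]\psi.
\end{align*}

Hence $E[\![K]\!]\rho \in T[\![\tau \to \mu\beta.\Sigma[K\eta]]\!]\psi$.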

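$A \vdash \text{is } K : (\mu\beta.\Sigma[K\eta]) \to \text{bool}$

Choose an arbitrary $v \in T[\![\mu\beta.\Sigma[K\eta]]\!]\psi$. Clearly,
$(E[\![\text{is } K]\!]\rho)\,v \in B$, whence
$E[\![\text{is } K]\!]\rho \in T[\![(\mu\beta.\Sigma[K\eta]) \to \text{bool}]\!]\psi$.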

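$A \vdash \text{let } K\ x = e \text{ in } e' : \tau'$

We follow the proofs in [Dam85] and [MPS86]. The first premise in the
type derivation is $A \vdash e : \tau$, where $\tau = \mu\beta.\Sigma[K\eta]$ and
$\eta = \exists\gamma_1 \ldots \gamma_n.\hat\tau$. Let $\{\alpha_1, \ldots, \alpha_k\} = FV(\tau) \setminus FV(A)$. Then, for
arbitrary $I_1, \ldots, I_k \in \Re$, $\models_{\rho,\psi[I_i/\alpha_i]} A$ holds, since none of the $\alpha_i$'s
are free in $A$.

Let $v = E[\![e]\!]\rho$; by the inductive assumption,
$v \in T[\![\tau]\!](\psi[I_i/\alpha_i])$. Consequently,

\begin{align*}
v &\in \bigcap_{I_1, \ldots, I_k \in \Re} T[\![\tau]\!](\psi[I_i/\alpha_i]) \\
  &= \bigcap_{I_1, \ldots, I_k \in \Re} T[\![\mu\beta.\Sigma[K\eta]]\!](\psi[I_i/\alpha_i]) \\
  &= \ldots + \{K\} \times \bigcap_{I_1, \ldots, I_k \in \Re} T[\![\eta]\!](\psi[I_i/\alpha_i, T[\![\tau]\!](\psi[I_i/\alpha_i])/\beta]) + \ldots.
\end{align*}

First, consider the more interesting case, $fst(v) = K$. Then

\[ snd(v) \in \bigcap_{I_1, \ldots, I_k \in \Re}\ \bigcup_{J_1, \ldots, J_n \in \Re} T[\![\hat\tau]\!](\psi[I_i/\alpha_i, J_j/\gamma_j, T[\![\tau]\!](\psi[I_i/\alpha_i])/\beta]). \]

Let $\alpha_1, \ldots, \alpha_h$, $h \leq k$, be those variables among $\alpha_1, \ldots, \alpha_k$ that are free
in $\hat\tau[\tau/\beta]$.

We now choose a finite $a$ such that $a \leq snd(v)$, thus

\[ a \in \bigcap_{I_1, \ldots, I_h \in \Re}\ \bigcup_{J_1, \ldots, J_n \in \Re} (T[\![\hat\tau[\tau/\beta]]\!](\psi[I_i/\alpha_i, J_j/\gamma_j]))^\circ. \]

By definition of set union and intersection, there exist functions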
A . /„e3(V)^3(V), 

such  that 

ae  n  (T[[t  [x/(3]]|  (\|/ [/./a  /. (7p  )/y.]) )° 

/p  1  i  J  1  n  J 

e  n  T[[t  [x/p]]]  (V|/  [I/a  f- (7p  )/y,]) 

=  n  rfx  [k  (ar  ....aj/y  x/p]]  (f  [7;/a  /./k  ]) 

=  Tl  Var..a/;.x  [K.(ar  ...,a/;)/y.,  x/(3]  ]]  (\|/[^./k.]) 

=  T  [[  ge/7  (A,  .sFo/<?m  (A,  r|  [x/p] ) )  ]]  (\|/  [/Vk.]  )  , 

assuming  that  the  K.’s  are  the  ones  generated  by  skoletn  (A,  r\  [x/(3] )  . 
Since  by  definition  of  skoletn,  none  of  the  ic.’s  are  free  in  A, 

|=p  y  jy/K  j  A  holds  and  we  can  extend  A  and  p,  obtaining 

l=p  [a/x]  >  ¥  [f./K.]  A  [gen  ( A ’  skolem  (A>  B  t^/P]  ))/•*]. 

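We now apply the inductive assumption to the last premise,

$$A[\mathit{gen}(A, \mathit{skolem}(A, \eta[\tau/\beta]))/x] \vdash e' : \tau',$$

and obtain

$$E[\![e']\!](\rho[a/x]) \in T[\![\tau']\!](\psi[f_j/\kappa_j]) = T[\![\tau']\!]\psi,$$

since $FS(\tau') \subseteq FS(A)$. Finally,

$$
\begin{aligned}
E[\![\mathbf{let}\ K\ x = e\ \mathbf{in}\ e']\!]\rho
  &= E[\![e']\!](\rho[\mathit{snd}(E[\![e]\!]\rho)/x]) \\
  &= \bigsqcup \{\, E[\![e']\!](\rho[a/x]) \mid a \text{ finite and } a \sqsubseteq \mathit{snd}(E[\![e]\!]\rho) \,\},
\end{aligned}
$$

by the continuity of $E$. The latter expression is in $T[\![\tau']\!]\psi$ by the
closure of ideals under limits.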
In the second case, $\mathit{fst}(v) \ne K$. For any functions $f_1, \ldots, f_n \in \mathfrak{I}(V)^h \to \mathfrak{I}(V)$
we have $\bot \in T[\![\mathit{gen}(A, \mathit{skolem}(A, \eta[\tau/\beta]))]\!](\psi[f_j/\kappa_j])$. Again,
since none of the $\kappa_j$'s are free in $A$, $\models_{\rho,\,\psi[f_j/\kappa_j]} A$ holds and we can
extend $A$ and $\rho$, obtaining

$$\models_{\rho[\bot/x],\ \psi[f_j/\kappa_j]} A[\mathit{gen}(A, \mathit{skolem}(A, \eta[\tau/\beta]))/x].$$

By applying the inductive assumption to the last premise,

$$A[\mathit{gen}(A, \mathit{skolem}(A, \eta[\tau/\beta]))/x] \vdash e' : \tau',$$

we obtain

$$E[\![e']\!](\rho[\bot/x]) \in T[\![\tau']\!](\psi[f_j/\kappa_j]) = T[\![\tau']\!]\psi.$$

This  concludes  our  proof  of  semantic  soundness. 

■ 

Corollary 3.15 [Semantic soundness] Let $\psi$ be a type environment such
that for every $\alpha \in \mathit{Dom}\,\psi$, $\mathit{wrong} \notin \psi(\alpha)$. If $A \vdash e : \tau$ and $\models_{\rho,\psi} A$, then
$E[\![e]\!]\rho \ne \mathit{wrong}$.

Proof: We apply Lemma 3.12 to Theorem 3.14.


4  An  Extension  of  ML  with  a 
Dotless  Dot  Notation 


In this chapter, we describe an extension of our language that allows more
flexible use of existential types. Following notations used in actual
programming languages, this extension assumes the same representation type
each time a value of existential type is accessed, provided that each access
is through the same identifier. We give a type reconstruction algorithm and
show semantic soundness by translating into the language from Chapter 3.

4.1  Introduction 

MacQueen [Mac86] observes that the use of existential types in connection
with an elimination construct (open, abstype, or our let) is impractical
in certain programming situations; often, the scope of the elimination
construct has to be made so large that some of the benefits of abstraction are
lost. In particular, the lowest-level entities have to be opened at the
outermost level; these are the traditional disadvantages of block-structured
languages.

We present an extension of ML that provides the same flexibility as the
dot notation described in [CL90]. In this extension, abstract types are again
modeled by ML datatypes with existentially quantified component types.
Values of abstract type are created by applying a datatype constructor to a
value, and they are decomposed in a pattern-matching let expression.
However, we allow existentially quantified type variables to escape the
scope of the identifier in whose type they appear, as long as the expression
decomposed is an identifier and the existentially quantified type variables do
not escape the scope of that identifier. Each decomposition of an identifier,
using the same constructor, produces identical existentially quantified type
variables. We call our notation a "dotless" dot notation, since it uses
decomposition by pattern-matching instead of record component selection.

4.2  Some  Motivating  Examples 

We  assume  the  type  declaration 

    datatype Key = key of 'a * ('a -> int)

in  the  following  examples.  In  the  first  example, 

    let val x = key(3, fn x => x + 2) in
      (let val key(_, f) = x in f end)
      (let val key(v, _) = x in v end)
    end

the existential type variable in the type of f is the same as the one in the type
of v, and the function application produces a result of type int. This follows
from the fact that both f and v are bound by decomposition of the same[1]
identifier, x. Consequently, both decompositions operate on the same value,
and the whole expression is type-correct.

[1] We assume the ML scoping discipline, which uses let statements as scope
boundaries; alternatively, one could require each bound identifier to be unique.
In a language with the traditional dot notation, for example Ada, abstract
types can be modeled as packages, and an example corresponding to the
previous one would look as follows:

    package KEY_PKG is
       type KEY is private;
       X : constant KEY;
       function F(X : KEY) return INTEGER;
    private
       type KEY is new INTEGER;
       X : constant KEY := 3;
    end KEY_PKG;

    package body KEY_PKG is
       function F(X : KEY) return INTEGER is
       begin
          return INTEGER(X) + 2;
       end F;
    end KEY_PKG;

    Z : INTEGER;

    Z := KEY_PKG.F(KEY_PKG.X);

The components of the abstract type KEY_PKG are selected using the dot
notation.

The  following  are  examples  of  incorrect  programs.  For  instance, 

    let val x = key(3, fn x => x + 2) in
      let val key(_, f) = x in
        f
      end
    end

is not type-correct, since the existential type variable in the type of f
escapes the scope of x. Neither is the following program,

    let val x = key(3, fn x => x + 2)
        val y = x
    in
      (let val key(_, f) = x in f end)
      (let val key(v, _) = y in v end)
    end

since different identifiers produce different existential type variables,
although they hold the same values in this case. As the latter cannot be
determined statically, we must assume that the values have different types.
Similarly,

    val z = (3, fn x => x + 2)
    let val key(_, f) = key z in
      let val key(v, _) = key z in
        f v
      end
    end

is not type-correct. Since the expressions that are decomposed are not even
identifiers, we cannot assume statically that f can be applied to v.
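For comparison, the following is a minimal sketch in present-day GHC Haskell, assuming the `ExistentialQuantification` extension; the names `Key`, `MkKey`, `applyKey`, and `example` are illustrative and do not come from the text. GHC does not implement the dotless dot notation: every decomposition introduces a fresh abstract type, so the function and its argument must be obtained from one and the same pattern match.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Counterpart of: datatype Key = key of 'a * ('a -> int)
-- (constructor renamed to MkKey, since Haskell constructors are capitalized).
data Key = forall a. MkKey a (a -> Int)

-- A single decomposition binds both components, so v and f share the
-- same abstract type and the application f v is accepted.
applyKey :: Key -> Int
applyKey (MkKey v f) = f v

-- Counterpart of the first, type-correct example of this section.
example :: Int
example = applyKey (MkKey (3 :: Int) (+ 2))
```

Splitting `applyKey` into two separate pattern matches on the same identifier, as in the first ML example above, would be rejected by GHC; that is precisely the restriction the extension described in this chapter lifts.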

4.3  Syntax 

4.3.1  Language  Syntax 

Syntactically, our underlying formal language is almost unchanged, except
that pattern-matching let expressions only allow an identifier to be
decomposed, not a general expression. This is not a significant restriction, since
we can always bind the expression in an enclosing let before decomposing
it. Again, we assume that each identifier bound by a $\lambda$ or let expression is
unique.


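Identifiers         $x$

Constructors        $K$

Type constructors   $T$

Expressions         $e ::= () \mid \mathbf{true} \mid \mathbf{false} \mid x \mid (e_1, e_2) \mid e\ e' \mid \lambda x.e \mid$
                    $\quad \mathbf{let}\ x = e\ \mathbf{in}\ e' \mid$
                    $\quad \mathbf{data}\ \forall\alpha_1 \ldots \alpha_n.\chi\ \mathbf{in}\ e \mid K \mid \mathbf{is}\ K \mid$
                    $\quad \mathbf{let}\ K\ x = x'\ \mathbf{in}\ e'$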


4.3.2  Type  Syntax 

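Type variables      $\alpha$
Skolem functions    $\kappa$
Types               $\tau ::= \mathit{unit} \mid \mathit{bool} \mid \alpha \mid \tau_1 \times \tau_2 \mid \tau \to \tau' \mid \kappa_{x,K,i}(\tau_1, \ldots, \tau_n) \mid \chi$
Recursive types     $\chi ::= \mu\beta.K_1\eta_1 + \ldots + K_m\eta_m$  where $K_i \ne K_j$ for $i \ne j$
Existential types   $\eta ::= \exists\alpha.\eta \mid \tau$
Type schemes        $\sigma ::= \forall\alpha.\sigma \mid \tau$
Assumptions         $a ::= \sigma/x \mid \forall\alpha_1 \ldots \alpha_n.\chi/K$

Our type syntax is almost unchanged. However, Skolem type constructors
are now uniquely associated with an identifier $x$ by using the symbol $\kappa$,
indexed by $x$, the constructor $K$ used in the decomposition, and the index $i$ of
the existentially quantified variable $\gamma_i$ to be replaced.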
4.4  Type  Inference

4.4.1  Instantiation  and  Generalization  of  Type  Schemes 

$\forall\alpha_1 \ldots \alpha_n.\tau \ge \tau'$ iff there are types $\tau_1, \ldots, \tau_n$ such that
$\tau' = \tau[\tau_1/\alpha_1, \ldots, \tau_n/\alpha_n]$

$\exists\alpha_1 \ldots \alpha_n.\tau \le \tau'$ iff there are types $\tau_1, \ldots, \tau_n$ such that
$\tau' = \tau[\tau_1/\alpha_1, \ldots, \tau_n/\alpha_n]$

$\mathit{gen}(A, \tau) = \forall(FV(\tau) \setminus FV(A)).\tau$

$\mathit{skolem}'(A, x, K, \exists\gamma_1 \ldots \gamma_n.\tau) = \tau[\kappa_{x,K,i}(\alpha_1, \ldots, \alpha_k)/\gamma_i]$
where $\{\alpha_1, \ldots, \alpha_k\} = FV(\exists\gamma_1 \ldots \gamma_n.\tau) \setminus FV(A)$

Instantiation and generalization are unchanged. The modified function
skolem' replaces each existentially quantified variable in a type by a unique
type constructor whose actual arguments are those free variables of the type
that are not free in the assumption set. Since identifiers are unique, we
obtain Skolem constructors uniquely associated with an identifier $x$ by using
the symbol $\kappa$, indexed by $x$, the constructor $K$ used in the decomposition,
and the index $i$ of the existentially quantified variable $\gamma_i$ to be replaced. In
addition to $FV$, the set of free type variables in a type scheme or assumption
set, we use $FS_x$, the set of those Skolem type constructors that occur in a
type scheme or assumption set and are associated with identifier $x$.
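The behavior of skolem' and $FS_x$ can be sketched on a toy representation of types. This is only an illustration under stated assumptions: the data type `Type`, the record `SkolemId`, and the functions `freeVars`, `skolemize`, and `skolemsOf` are hypothetical names introduced here, the free variables of the assumption set are passed in as a plain list, and type schemes and recursive types are omitted.

```haskell
import Data.List (nub)

-- Toy type syntax; all names are illustrative, not taken from the text.
data Type
  = TVar String             -- type variable
  | TInt                    -- base type int
  | TPair Type Type         -- t1 * t2
  | TFun Type Type          -- t -> t'
  | TSkolem SkolemId [Type] -- Skolem constructor applied to its arguments
  deriving (Eq, Show)

-- A Skolem constructor is identified by the decomposed identifier,
-- the constructor used, and the index of the replaced variable.
data SkolemId = SkolemId
  { skIdent :: String
  , skCon   :: String
  , skIndex :: Int
  } deriving (Eq, Show)

-- Free type variables of a type, without duplicates.
freeVars :: Type -> [String]
freeVars ty = nub (go ty)
  where
    go (TVar a)       = [a]
    go TInt           = []
    go (TPair s t)    = go s ++ go t
    go (TFun s t)     = go s ++ go t
    go (TSkolem _ ts) = concatMap go ts

-- skolemize envFree x k gammas body: replace the i-th existential variable
-- in gammas by the Skolem constructor for (x, k, i), applied to the free
-- variables of body that are neither existential nor free in the assumptions.
skolemize :: [String] -> String -> String -> [String] -> Type -> Type
skolemize envFree x k gammas body = go body
  where
    args  = [TVar a | a <- freeVars body, a `notElem` gammas, a `notElem` envFree]
    table = zip gammas [1 ..]
    go (TVar a)         = maybe (TVar a) (\i -> TSkolem (SkolemId x k i) args) (lookup a table)
    go TInt             = TInt
    go (TPair s t)      = TPair (go s) (go t)
    go (TFun s t)       = TFun (go s) (go t)
    go (TSkolem sid ts) = TSkolem sid (map go ts)

-- Occurrences of Skolem constructors associated with identifier x.
skolemsOf :: String -> Type -> [SkolemId]
skolemsOf x ty = case ty of
  TSkolem sid ts -> [sid | skIdent sid == x] ++ concatMap (skolemsOf x) ts
  TPair s t      -> skolemsOf x s ++ skolemsOf x t
  TFun s t       -> skolemsOf x s ++ skolemsOf x t
  _              -> []

-- Component type of the constructor key: 'a * ('a -> int), with 'a existential.
keyBody :: Type
keyBody = TPair (TVar "a") (TFun (TVar "a") TInt)

-- Decomposing x twice yields equal types; decomposing y yields a different one.
fromX, fromY :: Type
fromX = skolemize [] "x" "key" ["a"] keyBody
fromY = skolemize [] "y" "key" ["a"] keyBody
```

Because the Skolem constructor is a function of the identifier, constructor, and index only, two decompositions of `x` agree, while a decomposition of `y` does not. A scope check would then amount to requiring that `skolemsOf "x"` be empty on the relevant types.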

4.4.2  Inference  Rules  for  Expressions 

The first three typing rules are the same as in the original system.

(VAR)    $\dfrac{A(x) \ge \tau}{A \vdash' x : \tau}$

(PAIR)   $\dfrac{A \vdash' e_1 : \tau_1 \qquad A \vdash' e_2 : \tau_2}{A \vdash' (e_1, e_2) : \tau_1 \times \tau_2}$

(APPL)   $\dfrac{A \vdash' e : \tau' \to \tau \qquad A \vdash' e' : \tau'}{A \vdash' e\ e' : \tau}$

The ABS' and LET rules are modified to prevent Skolem constructors
associated with a bound variable from escaping the scope of that variable.

(ABS')   $\dfrac{A[\tau'/x] \vdash' e : \tau \qquad FS_x(A) \cup FS_x(\tau) = \emptyset}{A \vdash' \lambda x.e : \tau' \to \tau}$

(LET)    $\dfrac{A \vdash' e : \tau \qquad A[\mathit{gen}(A, \tau)/x] \vdash' e' : \tau' \qquad FS_x(A) \cup FS_x(\tau') = \emptyset}{A \vdash' \mathbf{let}\ x = e\ \mathbf{in}\ e' : \tau'}$

The rules DATA', CONS', TEST remain unchanged.

(DATA')  $\dfrac{\sigma = \forall\alpha_1 \ldots \alpha_n.\mu\beta.K_1\eta_1 + \ldots + K_m\eta_m \qquad FV(\sigma) = \emptyset \qquad A[\sigma/K_1, \ldots, \sigma/K_m] \vdash' e : \tau}{A \vdash' \mathbf{data}\ \sigma\ \mathbf{in}\ e : \tau}$

(CONS')  $\dfrac{A(K) \ge \mu\beta.\Sigma[K\eta] \qquad \eta[\mu\beta.\Sigma[K\eta]/\beta] \le \tau}{A \vdash' K : \tau \to \mu\beta.\Sigma[K\eta]}$

(TEST)   $\dfrac{A(K) \ge \mu\beta.\Sigma[K\eta]}{A \vdash' \mathbf{is}\ K : (\mu\beta.\Sigma[K\eta]) \to \mathit{bool}}$

(PAT)    $\dfrac{A(K) \ge \mu\beta.\Sigma[K\eta] \qquad A(x) \ge \mu\beta.\Sigma[K\eta] \qquad A[\mathit{gen}(A, \mathit{skolem}'(A, x, K, \eta[\mu\beta.\Sigma[K\eta]/\beta]))/x'] \vdash' e : \tau}{A \vdash' \mathbf{let}\ K\ x' = x\ \mathbf{in}\ e : \tau}$

The new PAT rule does not enforce any restriction on occurrence of Skolem
constructors. It only requires that the variable $x$ be of the same type as the
result type of the constructor $K$. The body $e$ is typed under the assumption
set extended with an assumption about the bound identifier $x'$.

4.5  Type  Reconstruction 

Again,  the  type  reconstruction  algorithm  is  a  straightforward  translation 
from  the  deterministic  typing  rules. 

4.5.1  Auxiliary  Functions 

While $\mathit{inst}_\forall$ and $\mathit{inst}_\exists$ are as in the preceding chapter, the other auxiliary
functions are the same as in the inference rules.

4.5.2  Algorithm 

Our  type  reconstruction  function  takes  an  assumption  set  and  an  expression, 
and  it  returns  a  substitution  and  a  type  expression.  There  is  one  case  for  each 
typing  rule. 

$TC'(A, x)$ =
    $(\mathit{Id}, \mathit{inst}_\forall(A(x)))$

$TC'(A, (e_1, e_2))$ =
    let $(S_1, \tau_1) = TC'(A, e_1)$
        $(S_2, \tau_2) = TC'(S_1 A, e_2)$
    in  $(S_2 S_1, S_2\tau_1 \times \tau_2)$

$TC'(A, e\ e')$ =
    let $(S, \tau) = TC'(A, e)$
        $(S', \tau') = TC'(SA, e')$
        $\beta$ be a fresh type variable
        $U = \mathit{mgu}(S'\tau, \tau' \to \beta)$
    in  $(US'S, U\beta)$

$TC'(A, \lambda x.e)$ =
    let $\beta$ be a fresh type variable
        $(S, \tau) = TC'(A[\beta/x], e)$
    in
        if $FS_x(SA) \cup FS_x(\tau) = \emptyset$ then
            $(S, S\beta \to \tau)$

$TC'(A, \mathbf{let}\ x = e\ \mathbf{in}\ e')$ =
    let $(S, \tau) = TC'(A, e)$
        $(S', \tau') = TC'(SA[\mathit{gen}(SA, \tau)/x], e')$
    in
        if $FS_x(S'SA) \cup FS_x(\tau') = \emptyset$ then
            $(S'S, \tau')$

$TC'(A, \mathbf{data}\ \sigma\ \mathbf{in}\ e)$ =
    let $\forall\alpha_1 \ldots \alpha_n.\mu\beta.K_1\eta_1 + \ldots + K_m\eta_m = \sigma$ in
        if $FV(\sigma) = \emptyset$ then
            $TC'(A[\sigma/K_1, \ldots, \sigma/K_m], e)$

$TC'(A, K)$ =
    let $\tau = \mathit{inst}_\forall(A(K))$
        $\mu\beta.\ldots + K\eta + \ldots = \tau$
    in  $(\mathit{Id}, (\mathit{inst}_\exists(\eta[\tau/\beta])) \to \tau)$

$TC'(A, \mathbf{is}\ K)$ =
    let $\tau = \mathit{inst}_\forall(A(K))$
    in  $(\mathit{Id}, \tau \to \mathit{bool})$

$TC'(A, \mathbf{let}\ K\ x' = x\ \mathbf{in}\ e')$ =
    let $\tau = \mathit{inst}_\forall(A(x))$
        $U = \mathit{mgu}(\tau, \mathit{inst}_\forall(A(K)))$
        $\mu\beta.\ldots + K\eta + \ldots = U\tau$
        $\tau_\kappa = \mathit{skolem}'(UA, x, K, \eta[U\tau/\beta])$
        $(S, \tau') = TC'(UA[\mathit{gen}(UA, \tau_\kappa)/x'], e')$
    in
        $(SU, \tau')$
4.5.3  Syntactic  Soundness  and  Completeness  of  Type
Reconstruction

Lemma 4.1 [Stability of $\vdash'$] If $A \vdash' e : \tau$ and $S$ is a substitution, then
$SA \vdash' e : S\tau$ also holds. Moreover, if there is a proof tree for $A \vdash' e : \tau$ of
height $n$, then there is also a proof tree for $SA \vdash' e : S\tau$ of height less than or
equal to $n$.

Theorem 4.2 [Syntactic soundness] If $TC'(A, e) = (S, \tau)$, then $SA \vdash' e : \tau$.

Definition 4.1 [Principal Type] $\tau$ is a principal type of expression $e$ under
assumption set $A$ if $A \vdash' e : \tau$ and whenever $A \vdash' e : \tau'$ then there is a
substitution $S$ such that $S\tau = \tau'$.

Theorem 4.3 [Syntactic completeness] If $\hat{S}A \vdash' e : \hat\tau$, then
$TC'(A, e) = (S, \tau)$ and there is a substitution $R$ such that $\hat{S}A = RSA$ and
$\hat\tau = R\tau$.

Corollary 4.4 [Principal type] If $TC'(A, e) = (S, \tau)$, then $\tau$ is a principal
type for $e$ under $A$.

Proof: We modify the proofs given in Chapter 3.

4.6  A  Translation  Semantics 

We retain our original semantic interpretation $E[\![\ ]\!]$. Following [CL90], we
prove semantic soundness by giving a type- and semantics-preserving
translation to our original language. The idea is that we can enclose an expression
$e$ with subexpressions of the form $\mathbf{let}\ K\ x' = x\ \mathbf{in}\ e'$ by an outer
expression that defines $x'$ and replace $\mathbf{let}\ K\ x' = x\ \mathbf{in}\ e'$ by $e'$. That is, we
replace $e$ by

$$\mathbf{let}\ K\ x_K = x\ \mathbf{in}\ e[\,e'[x_K/x'] \,/\, \mathbf{let}\ K\ x' = x\ \mathbf{in}\ e'\,]$$

We choose the enclosing let expression defining $x'$ large enough so that no
existentially quantified type variables arising through the inner let
expressions escape this outer definition. Since the ABS' and LET rules guarantee
that no existentially quantified variables emerging from the decomposition
of $x$ escape the scope of $x$, it is safe to enclose the whole body of the $\lambda$ or
let expression.

However, we must be careful, since the outer decomposition in the
translation might fail, while the inner decomposition in the original expression
might not necessarily have been reached; this is possible if the value held by
$x$ does not have the constructor tag $K$. Therefore, we need to replace $e$ by
an if expression with branches for each constructor tag in the datatype that
$x$ has. This is reflected in the definition of the auxiliary translation function
$\|\cdot\|$ below.

4.6.1  Modified  Original  Language 

Type judgments in a modified version of the original language are of the
form $A \vdash^{\circ} e : \tau$. We modify the skolem function and the PAT rule of our
original language:

$\mathit{skolem}^{\circ}(A, x, \exists\gamma_1 \ldots \gamma_n.\tau) = \tau[\kappa_{x,i}(\alpha_1, \ldots, \alpha_k)/\gamma_i]$
where $\{\alpha_1, \ldots, \alpha_k\} = FV(\exists\gamma_1 \ldots \gamma_n.\tau) \setminus FV(A)$

Unique Skolem type constructors can be generated by using the symbol $\kappa$,
indexed by the unique name $x$ of the bound identifier and the index $i$ of the
existentially quantified type variable $\gamma_i$ to be replaced.

(PAT°)   $\dfrac{A \vdash^{\circ} e : \mu\beta.\Sigma[K\eta] \qquad FS_x(A) \cup FS_x(\tau') = \emptyset \qquad A[\mathit{gen}(A, \mathit{skolem}^{\circ}(A, x, \eta[\mu\beta.\Sigma[K\eta]/\beta]))/x] \vdash^{\circ} e' : \tau'}{A \vdash^{\circ} \mathbf{let}\ K\ x = e\ \mathbf{in}\ e' : \tau'}$

Using this modified skolem° function, the PAT° rule can enforce that newly
generated Skolem constructors do not escape their scope by the condition
$FS_x(A) \cup FS_x(\tau') = \emptyset$, which expresses that no Skolem constructor
associated with $x$ may escape the scope of $x$.

It is easy to see that this language has the same properties as the original
one, in particular, semantic soundness.

4.6.2  Auxiliary  Translation  Function 

The bodies of $\lambda$ and let expressions are translated by the auxiliary
function given below. It moves all pattern-matching let expressions that
decompose the variable bound by the enclosing $\lambda$ or let expression to the
outermost level possible.

We use a conformity check in form of a nested if expression with is
expressions to determine the constructor tag of the value held by $x$. This
requires us to evaluate[1] $x$; consequently, the resulting expression is always
strict in $x$. Therefore, this translation is not semantics-preserving if the
original expression was non-strict in $x$. We need to distinguish between the
translation of the strict and the non-strict version of our language:

•  In the strict language, the expression bound to $x$ is already evaluated
at binding time, and evaluating it again leaves the semantics
unchanged.

•  In the non-strict language, the expression bound to $x$ might not be
evaluated at all; to be semantics-preserving, the translation must not
introduce additional evaluations of $x$.

[1] It actually suffices to evaluate the argument to weak head normal form, so that
the top-level constructor of the argument can be inspected; see [PJ87] for details.
Nevertheless, the resulting translation is not semantics-preserving.

As described in [PJ87], the only patterns for which a conformity check can
be omitted safely are the irrefutable patterns involving datatypes with a
single constructor. We therefore restrict the non-strict version of our language
in the following way:

Existentially quantified type variables may occur only in the
component types of datatypes with a single constructor.
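The difference between a pattern that needs a conformity check and an irrefutable one can be observed directly in standard Haskell. The sketch below uses an ordinary single-constructor datatype, not an existential one, and the names `Box`, `strictConst`, `lazyConst`, and `lazyResult` are illustrative only.

```haskell
-- A datatype with a single constructor.
data Box = Box Int

-- An ordinary constructor pattern forces the argument to weak head
-- normal form, so strictConst undefined raises an exception.
strictConst :: Box -> Int
strictConst (Box _) = 0

-- An irrefutable (lazy) pattern never inspects the argument,
-- so the function is non-strict in it.
lazyConst :: Box -> Int
lazyConst ~(Box _) = 0

-- Evaluates to 0 even though the argument is undefined.
lazyResult :: Int
lazyResult = lazyConst undefined
```

With more than one constructor, a match on one of them may fail, so the tag has to be inspected and the argument is necessarily evaluated; this is why the restriction above is limited to single-constructor datatypes.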

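The auxiliary translation function for the strict version of the language
is defined as follows:

$\|e\|_{x,\ K_1\eta_1 + \ldots + K_n\eta_n}$ =

    if is $K_1$ $x$ then
        let $K_1\ x_{K_1} = x$ in
            $e\,[\,e'[x_{K_1}/x'] \,/\, \mathbf{let}\ K_1\ x' = x\ \mathbf{in}\ e',\ \ \mathit{fail} \,/\, \mathbf{let}\ K_{i \ne 1}\ x' = x\ \mathbf{in}\ e'\,]$
    else if is $K_2$ $x$ then
        ...
    else if is $K_n$ $x$ then
        let $K_n\ x_{K_n} = x$ in
            $e\,[\,e'[x_{K_n}/x'] \,/\, \mathbf{let}\ K_n\ x' = x\ \mathbf{in}\ e',\ \ \mathit{fail} \,/\, \mathbf{let}\ K_{i \ne n}\ x' = x\ \mathbf{in}\ e'\,]$
    else
        $e\,[\,\mathit{fail} \,/\, \mathbf{let}\ K_i\ x' = x\ \mathbf{in}\ e'\,]$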
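The effect of hoisting the decomposition, including the strictness it introduces, can be sketched in GHC Haskell. This assumes the `ExistentialQuantification` extension; the datatype `Item` and the functions `original` and `hoisted` are hypothetical and only mimic the shape of the translation, with `error "fail"` standing in for fail.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Hypothetical datatype with two constructors, each with an
-- existentially quantified component type.
data Item
  = forall a. Circle a (a -> Int)
  | forall b. Square b (b -> Int)

-- Shape of the source expression: x is decomposed with Circle only
-- inside one branch, so x is not inspected when useIt is False.
original :: Bool -> Item -> Int
original useIt x =
  if useIt
    then case x of
           Circle v f -> f v
           Square _ _ -> error "fail"
    else 0

-- Shape of the translated expression: the constructor tag of x is
-- inspected once at the outermost level; decompositions with a
-- non-matching constructor are replaced by failure.
hoisted :: Bool -> Item -> Int
hoisted useIt x = case x of
  Circle v f -> if useIt then f v else 0
  Square _ _ -> if useIt then error "fail" else 0
```

On defined arguments both functions agree. In a non-strict setting, however, `original False undefined` yields 0 while `hoisted False undefined` diverges, which illustrates why the translation is semantics-preserving only for the strict language unless conformity checks can be omitted.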

In the non-strict case, there can be only a single constructor with an
existential component type, and the auxiliary translation function reduces to:

$\|e\|_{x,\ K\eta} = \mathbf{let}\ K\ x_K = x\ \mathbf{in}\ e\,[\,e'[x_K/x'] \,/\, \mathbf{let}\ K\ x' = x\ \mathbf{in}\ e'\,]$

$\|e\|_{x,\ K_1\tau_1 + \ldots + K_n\tau_n} = e$
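4.6.3  Inference-guided  Translation

We give a translation guided by the type inference rules, along the lines of
[NS91]. Let $e_0$ be a closed, well-typed term. The translation is defined along
with the type inference rules for each subterm of $e_0$.

(VAR)    $\dfrac{A(x) \ge \tau}{A \vdash' x : \tau \Rightarrow x}$

(PAIR)   $\dfrac{A \vdash' e_1 : \tau_1 \Rightarrow \tilde{e}_1 \qquad A \vdash' e_2 : \tau_2 \Rightarrow \tilde{e}_2}{A \vdash' (e_1, e_2) : \tau_1 \times \tau_2 \Rightarrow (\tilde{e}_1, \tilde{e}_2)}$

(APPL)   $\dfrac{A \vdash' e : \tau' \to \tau \Rightarrow \tilde{e} \qquad A \vdash' e' : \tau' \Rightarrow \tilde{e}'}{A \vdash' e\ e' : \tau \Rightarrow \tilde{e}\ \tilde{e}'}$

(ABS)    $\dfrac{A[\tau'/x] \vdash' e : \tau \Rightarrow \tilde{e} \qquad \tau' \ne \mu\beta.\Sigma[K\eta]}{A \vdash' \lambda x.e : \tau' \to \tau \Rightarrow \lambda x.\tilde{e}}$

(ABS')   $\dfrac{A[\mu\beta.\Sigma[K\eta]/x] \vdash' e : \tau \qquad FS_x(A) \cup FS_x(\tau) = \emptyset \qquad A[\mu\beta.\Sigma[K\eta]/x] \vdash' \|e\|_{x,\Sigma[K\eta]} : \tau \Rightarrow \tilde{e}}{A \vdash' \lambda x.e : (\mu\beta.\Sigma[K\eta]) \to \tau \Rightarrow \lambda x.\tilde{e}}$

(LET)    $\dfrac{A \vdash' e : \tau \Rightarrow \tilde{e} \qquad \tau \ne \mu\beta.\Sigma[K\eta] \qquad A[\mathit{gen}(A, \tau)/x] \vdash' e' : \tau' \Rightarrow \tilde{e}'}{A \vdash' \mathbf{let}\ x = e\ \mathbf{in}\ e' : \tau' \Rightarrow \mathbf{let}\ x = \tilde{e}\ \mathbf{in}\ \tilde{e}'}$

(LET')   $\dfrac{A \vdash' e : \mu\beta.\Sigma[K\eta] \Rightarrow \tilde{e} \qquad FS_x(A) \cup FS_x(\tau') = \emptyset \qquad A[\mathit{gen}(A, \mu\beta.\Sigma[K\eta])/x] \vdash' e' : \tau' \qquad A[\mathit{gen}(A, \mu\beta.\Sigma[K\eta])/x] \vdash' \|e'\|_{x,\Sigma[K\eta]} : \tau' \Rightarrow \tilde{e}'}{A \vdash' \mathbf{let}\ x = e\ \mathbf{in}\ e' : \tau' \Rightarrow \mathbf{let}\ x = \tilde{e}\ \mathbf{in}\ \tilde{e}'}$

(DATA)   $\dfrac{\sigma = \forall\alpha_1 \ldots \alpha_n.\mu\beta.K_1\eta_1 + \ldots + K_m\eta_m \qquad FV(\sigma) = \emptyset \qquad A[\sigma/K_1, \ldots, \sigma/K_m] \vdash' e : \tau \Rightarrow \tilde{e}}{A \vdash' \mathbf{data}\ \sigma\ \mathbf{in}\ e : \tau \Rightarrow \mathbf{data}\ \sigma\ \mathbf{in}\ \tilde{e}}$

(CONS)   $\dfrac{A(K) \ge \mu\beta.\Sigma[K\eta] \qquad \eta[\mu\beta.\Sigma[K\eta]/\beta] \le \tau}{A \vdash' K : \tau \to \mu\beta.\Sigma[K\eta] \Rightarrow K}$

(TEST)   $\dfrac{A(K) \ge \mu\beta.\Sigma[K\eta]}{A \vdash' \mathbf{is}\ K : (\mu\beta.\Sigma[K\eta]) \to \mathit{bool} \Rightarrow \mathbf{is}\ K}$

(PAT)    $\dfrac{A(K) \ge \mu\beta.\Sigma[K\eta] \qquad A(x) \ge \mu\beta.\Sigma[K\eta] \qquad A[\mathit{gen}(A, \mathit{skolem}'(A, x, K, \eta[\mu\beta.\Sigma[K\eta]/\beta]))/x'] \vdash' e : \tau \Rightarrow \tilde{e}}{A \vdash' \mathbf{let}\ K\ x' = x\ \mathbf{in}\ e : \tau \Rightarrow \mathbf{let}\ K\ x' = x\ \mathbf{in}\ \tilde{e}}$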


4.6.4  Translation  of  Type  Schemes  and  Assumption  Sets 

After applying $\|\cdot\|$ to the body of a $\lambda$ or let expression, the only
pattern-matching let expressions left in the body are of the form
$\mathbf{let}\ K\ x_K = x\ \mathbf{in}\ e$. In the following translation, the Skolem constructors
associated with $x$ become associated with $x_K$. This is reflected by the
following translations:

$\lfloor\sigma\rfloor = \sigma[\kappa_{x_K,i}/\kappa_{x,K,i}]$

$\lfloor A \rfloor = [\,\lfloor\sigma\rfloor/x \mid A(x) = \sigma\,]$

4.6.5  Properties  of  the  Translation 

Lemma 4.5 Let $A = A[\forall\alpha_1 \ldots \alpha_n.\mu\beta.K_1\eta_1 + \ldots + K_n\eta_n / x]$. If $A \vdash' e : \tau$,
then $A \vdash' \|e\|_{x,\ K_1\eta_1 + \ldots + K_n\eta_n} : \tau$.

Proof: In the strict case, let $1 \le i \le n$ be arbitrary. We are free to extend $A$,
assuming that $x_{K_i}$ is not free in $A$, hence

$$A[\mathit{gen}(A, \mathit{skolem}'(A, x, K_i, \eta_i))/x_{K_i}] \vdash' e : \tau.$$

Then, any subexpression $\mathbf{let}\ K_i\ x' = x\ \mathbf{in}\ e'$ of $e$ is well-typed and
we have a subproof for $A' \vdash' \mathbf{let}\ K_i\ x' = x\ \mathbf{in}\ e' : \tau'$, where
$A'(x_{K_i}) = \mathit{gen}(A, \mathit{skolem}'(A, x, K_i, \eta_i))$. A premise of this judgment is
$A'[\mathit{gen}(A', \mathit{skolem}'(A', x, K_i, \eta_i))/x'] \vdash' e' : \tau'$. Therefore,
$A' \vdash' e'[x_{K_i}/x'] : \tau'$, since we may drop the assumption about $x'$ after
substituting $x_{K_i}$ for it.

By replacing the proof tree for the let subexpression by this latter one
and by observing that fail has any type, we can prove

$$A[\mathit{gen}(A, \mathit{skolem}'(A, x, K_i, \eta_i))/x_{K_i}] \vdash' e\,[\,e'[x_{K_i}/x'] \,/\, \mathbf{let}\ K_i\ x' = x\ \mathbf{in}\ e',\ \ \mathit{fail} \,/\, \mathbf{let}\ K_{j \ne i}\ x' = x\ \mathbf{in}\ e'\,] : \tau$$

Thus,

$$A \vdash' \mathbf{let}\ K_i\ x_{K_i} = x\ \mathbf{in}\ e\,[\,e'[x_{K_i}/x'] \,/\, \mathbf{let}\ K_i\ x' = x\ \mathbf{in}\ e',\ \ \mathit{fail} \,/\, \mathbf{let}\ K_{j \ne i}\ x' = x\ \mathbf{in}\ e'\,] : \tau$$

Using a suitable typing for the if expressions, we conclude that

$$A \vdash' \|e\|_{x,\ K_1\eta_1 + \ldots + K_n\eta_n} : \tau$$

The claim for the non-strict case follows analogously.


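Theorem 4.6 [Type preservation] If $A \vdash' e : \tau \Rightarrow \tilde{e}$, then $\lfloor A \rfloor \vdash^{\circ} \tilde{e} : \lfloor\tau\rfloor$.

Proof: By structural induction on $e$. We show the only interesting case; all
others are straightforward.

$A \vdash' \mathbf{let}\ K\ x' = x\ \mathbf{in}\ e : \tau \Rightarrow \mathbf{let}\ K\ x' = x\ \mathbf{in}\ \tilde{e}$

Since our expression is a subexpression of a well-typed expression, $x$
is bound either in a $\lambda$ or in a let expression. Thus, it must be a
subexpression of an expression of the form $\|e'\|_{x,\Sigma[K\eta]}$, where
$A'[\forall\alpha_1 \ldots \alpha_n.\mu\beta.\Sigma[K\eta]/x] \vdash' \|e'\|_{x,\Sigma[K\eta]} : \tau'$ and
$FS_x(A') \cup FS_x(\tau') = \emptyset$ for some $A'$ and $\tau'$. By definition of $\|\cdot\|$, the
only subexpressions of $\|e'\|$ of the form $\mathbf{let}\ K\ x' = x\ \mathbf{in}\ e$
are the branches of the if expression, each of the form
$\mathbf{let}\ K\ x_K = x\ \mathbf{in}\ e$, where
$A'[\forall\alpha_1 \ldots \alpha_n.\mu\beta.\Sigma[K\eta]/x] \vdash' \mathbf{let}\ K\ x_K = x\ \mathbf{in}\ e : \tau'$ and
$FS_x(A') \cup FS_x(\tau') = \emptyset$; therefore $\tau = \tau'$ and $A = A'$ in the subproof.
As a premise, we have

$$A[\mathit{gen}(A, \mathit{skolem}'(A, x, K, \eta[\mu\beta.\Sigma[K\eta]/\beta]))/x_K] \vdash' e : \tau \Rightarrow \tilde{e};$$

by the inductive assumption,

$$\lfloor A[\mathit{gen}(A, \mathit{skolem}'(A, x, K, \eta[\mu\beta.\Sigma[K\eta]/\beta]))/x_K] \rfloor \vdash^{\circ} \tilde{e} : \lfloor\tau\rfloor,$$

hence

$$\lfloor A \rfloor[\mathit{gen}(A, \lfloor \mathit{skolem}'(A, x, K, \eta[\mu\beta.\Sigma[K\eta]/\beta]) \rfloor)/x_K] \vdash^{\circ} \tilde{e} : \lfloor\tau\rfloor.$$

Since

$$\lfloor \mathit{skolem}'(A, x, K, \eta[\mu\beta.\Sigma[K\eta]/\beta]) \rfloor = \mathit{skolem}^{\circ}(A, x_K, \eta[\mu\beta.\Sigma[K\eta]/\beta])$$

we have

$$\lfloor A \rfloor[\mathit{gen}(A, \mathit{skolem}^{\circ}(A, x_K, \eta[\mu\beta.\Sigma[K\eta]/\beta]))/x_K] \vdash^{\circ} \tilde{e} : \lfloor\tau\rfloor.$$

We translate the other two premises and obtain $\lfloor A \rfloor(K) \ge \lfloor\mu\beta.\Sigma[K\eta]\rfloor$
and $\lfloor A \rfloor(x) \ge \lfloor\mu\beta.\Sigma[K\eta]\rfloor$. We further observe that
$\lfloor FS_x(A) \rfloor = FS_{x_K}(\lfloor A \rfloor)$ and $\lfloor FS_x(\tau) \rfloor = FS_{x_K}(\lfloor\tau\rfloor)$, thus

$$FS_{x_K}(\lfloor A \rfloor) \cup FS_{x_K}(\lfloor\tau\rfloor) = \emptyset.$$

Finally, we can apply the PAT° rule and conclude that
$\lfloor A \rfloor \vdash^{\circ} \mathbf{let}\ K\ x_K = x\ \mathbf{in}\ \tilde{e} : \lfloor\tau\rfloor$.

■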

Lemma 4.7 $E[\![e]\!]\rho = E[\![\,\|e\|_{x,\ K_1\eta_1 + \ldots + K_n\eta_n}]\!]\rho$ for arbitrary $\rho$ defined
for $x$.

Proof: By definition of $E$, any subexpression of $e$ is evaluated in an
environment $\rho' \supseteq \rho$, whence $\rho'(x) = \rho(x)$. We identify two cases:

p(x)  e  { KA  x  V  for  some  i.  Then,  in  the  strict  case  only, 

£[[let  K.  -x'=x  in  e'J  p'  =  _L 

and  in  both  cases, 

£[[let  K-x'=x  in  e']p'  =  E  [V  [x^/x']  ]]  (p1  [sndfpCLO/x^  ] ) 
Consequently,  in  the  strict  case,
£[L<?11  p 


e'  \XK  Z^']  /let  K- x' =  x  in  e' 

Ele  ]  (p  [snd(p(x))/x^]) 

fail /let  K-  ■  x' =  x  in  e'  1 

L  J  T-  l  J 


e'  \XK  Z^']  /let  K;x'=x  in  e'~ 


E  [[  let  K-  xK  =  x  in  e  1  ]  p 

'  fail /let  K-  ■  x' =  x  in  e' 

L  J  T-  l  J 


since  the  if  branch  for  i  gets  selected. 

In  the  non-strict  case  for  i  =  1 , 

EleJ  P 

=  E  le  \e'  [ xK/x ']  /let  K  x'  =  x  in  e ']  J  (p  [snd(p(.r))/.r^] ) 

=  E  [[let  K  xK  =  x  in  e  [ e '  [xK/x']  /let  K  x'  =  x  in  e']  ]  p 

=  I eWx, ^Tr|]l  P  > 

and  for  i  >  1 , 

^[[e]]  p  =  E  [[  \\e\\  K  +K  ]  p  . 

11  n  n 


p(.r)  £  {Kj}  x  V  for  any  i. 

Then,  in  the  strict  case,  E  [[let  a;'  =  in  e'J  p'  =  _L,  whence 


EleJ  p 

=  E  [[e  [fail /let  K-  x'  =  x  in  e’\  J  p 

=  £[[  \\e\\X'K^  +  "'+Knx\J  P  ’ 

since  the  last  else  branch  gets  selected. 

In  the  non-strict  case  for  i  =  1 , 

£[[let  K  x'  =  x  in  e'Jp' 
=  Ele  1  (p’  [-L/jc*])

=  E  H>'  [ xK/x ']  1  (p1  [l/xz] )  . 

Therefore, 

EleJ  p 

=  E  [[e  [e1  [xK/x']  /let  K  x' =  x  in  e']  ]]  (p  [-L/v^] ) 
=  El\\e\\x  Ky]J  p  . 


Theorem 4.8 [Preservation of semantics] If $A \vdash' e : \tau \Rightarrow \tilde{e}$, then
$E[\![e]\!]\rho = E[\![\tilde{e}]\!]\rho$ for arbitrary $\rho$.

Proof: By structural induction on $e$.

Corollary 4.9 [Semantic soundness] If $e$ is a closed term, $A \vdash' e : \tau$ and
$\models_{\rho,\psi} A$ as defined previously, then $E[\![e]\!]\rho \in T[\![\tau]\!]\psi$.

Proof: Follows immediately from the two theorems, observing that $\lfloor\tau\rfloor = \tau$
and $\lfloor A \rfloor = A$, since neither $\tau$ nor $A$ contain any $\kappa$'s.


5  An  Extension  of  Haskell  with 
First-Class  Abstract  Types 


This chapter introduces an extension of the functional language Haskell
with existential types. Existential types combine well with the systematic
overloading polymorphism provided by Haskell type classes. Briefly, we
extend Haskell's data declaration in a similar way as the ML datatype
declaration above. In Haskell, it is possible to specify what type class a
(universally quantified) type variable belongs to. In our extension, we can do the
same for existentially quantified type variables. This lets us use type classes
as signatures of abstract data types; we can then construct heterogeneous
aggregates over a given type class. A type reconstruction algorithm is given,
and semantic soundness is shown by translating into an extension of the
language from Chapter 3.

5.1  Introduction 

Haskell [HPJW+92] uses type classes as a systematic approach to ad-hoc
polymorphism, otherwise known as overloading. Type classes capture
common sets of operations. A particular type may be an instance of a type class,
and has an operation corresponding to each operation defined in the type
class. Type classes may be arranged hierarchically.

In [WB89], Wadler and Blott called for a closer exploration of the
relationship between type classes and abstract data types. After an initial
exploration described in [LO91], we now present an extension of Haskell with
datatypes whose component types may be existentially quantified.

In Haskell, an algebraic datatype declaration is of the form

    data [c =>] T a_1 ... a_n = K_1 t_11 ... t_1k_1 | ... | K_m t_m1 ... t_mk_m

It introduces a new type constructor T with value constructors K_1, ..., K_m.
The optional context c specifies of which type classes the type variables
a_1, ..., a_n are instances. The constructors are used in two ways: as functions
to construct values, and in patterns to decompose values already
constructed. The types of the constructors are universally quantified over the type
variables a_1, ..., a_n; no other type variable may appear free in the component
types t_ij.

We describe an extension of Haskell analogous to the extension of ML
described above. Type variables that appear free in the component types are
interpreted as existentially quantified. In addition to the "global" context for
the universally quantified parameters of the type constructor, we introduce
"local" contexts for each value constructor. The local context specifies of
which type classes the existentially quantified type variables in the
component types are instances. The extended datatype declaration is of the form

    data [c =>] T a_1 ... a_n = [c_1 =>] K_1 t_11 ... t_1k_1
                              | ...
                              | [c_m =>] K_m t_m1 ... t_mk_m

When constructing a value using a constructor with an existentially
quantified component type, the existential type variables instantiate to the actual
types of the corresponding function arguments, and we lose any information
on the actual types. However, we know that these types are instances of the
same type classes as the corresponding existential type variables. This
means that we have types whose identity is unknown but which support the
operations specified by their type classes. Therefore we regard type classes
as signatures of abstract types.

5.2  Some  Motivating  Examples 

5.2.1  Minimum  over  a  Heterogeneous  List 

This example is the extended Haskell version of the example given in Section
3.2.1. We first define a type class Key defining the operation whatkey
needed to obtain an integer value from the value to be compared.

class Key a where
    whatkey :: a -> Int

We  now  define  a  datatype  KEY  with  a  single  constructor  key.  The  component 
type  of  key  is  the  type  variable  a,  which  is  existentially  quantified  and  is 
required  to  be  an  instance  of  type  class  Key. 

data  KEY  =  (Key  a)  =>  key  a 

We  further  define  several  instances  of  Key  along  with  their  implementations 
of  the  function  whatkey. 

instance Key Int where whatkey = id
instance Key Float where whatkey = round
instance Key [a] where whatkey = length
instance Key Bool where whatkey =
    \x -> if x then 1 else 0

A  heterogeneous  list  of  values  of  type  KEY  could  be  defined  as  follows: 

hetlist = [key 3, key [1, 2, 3, 4], key 7,
           key True, key 12]

The min function finds the minimum over a list of KEY’s by decomposing
the elements of the list and comparing their corresponding integer values
obtained by applying whatkey.

min [x] = x
min ((key v1):xs) =
    case min xs of
        key v2 ->
            if whatkey v1 <= whatkey v2 then
                key v1
            else
                key v2

Then  min  hetlist  evaluates  to  key  True,  as  this  is  the  element  for  which 
whatkey  returns  the  smallest  number. 
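
For readers who want to run this example, the following is a minimal sketch in present-day GHC Haskell, which is an assumption on our part and not the extended Haskell of this chapter. It uses GHC's `ExistentialQuantification` extension, writes the constructor as `MkKey` (GHC requires capitalized constructors), renames min to `minKey` to avoid the Prelude function, and adds type annotations on numeric literals. The helper `keyOf` and the empty-list case are illustrative additions.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Signature of the abstract type: one operation yielding an integer key.
class Key a where
  whatkey :: a -> Int

-- The type variable a is hidden (existentially quantified) and must be a Key.
data KEY = forall a. Key a => MkKey a

instance Key Int where whatkey = id
instance Key Float where whatkey = round
instance Key [a] where whatkey = length
instance Key Bool where whatkey x = if x then 1 else 0

-- A heterogeneous list: every element has type KEY.
hetlist :: [KEY]
hetlist =
  [ MkKey (3 :: Int), MkKey [1, 2, 3, 4 :: Int], MkKey (7 :: Int)
  , MkKey True, MkKey (12 :: Int) ]

-- Illustrative helper: open a KEY and apply whatkey to its content.
keyOf :: KEY -> Int
keyOf (MkKey v) = whatkey v

-- Minimum over a list of KEYs, following the structure of min in the text.
minKey :: [KEY] -> KEY
minKey [] = error "minKey: empty list"
minKey [x] = x
minKey (MkKey v1 : xs) =
  case minKey xs of
    MkKey v2
      | whatkey v1 <= whatkey v2 -> MkKey v1
      | otherwise                -> MkKey v2
```

Loaded into GHCi, `keyOf (minKey hetlist)` evaluates to 1, the key of `MkKey True`.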

5.2.2  Abstract  Stack  with  Multiple  Implementations 

We also give the extended Haskell version of the stack example from Section
3.2.2. However, these stacks have a fixed element type, since Haskell
type classes cannot be parameterized. An extension of Haskell with parameterized
type classes is found in [CHO92]; it could in turn be extended with
existential types, which would allow us to have polymorphic abstract stacks.
An integer stack is described by the following type class:

class Stack a where
    empty   :: a
    push    :: Int -> a -> a
    pop     :: a -> a
    top     :: a -> Int
    isempty :: a -> Bool

To achieve abstraction, we define the corresponding datatype of “encapsulated”
stacks:

data STACK = (Stack a) => Stack a

We define two stack implementations, one based on a list of integers:

instance Stack [Int] where
    empty   = []
    push    = (:)
    pop     = tail
    top     = head
    isempty = null

and  one  based  on  an  integer  array: 

maxIndex :: Int
maxIndex = 100

data FixedArray = Fixarr Int (Array Int Int)

instance Stack FixedArray where
    empty = Fixarr 0 (listArray (1, maxIndex) [])
    push a (Fixarr i s) =
        if i >= maxIndex then
            error "stack size exceeded"
        else
            Fixarr (i+1) (s // [(i+1) := a])
    pop (Fixarr i s) =
        if i <= 0 then
            error "stack empty"
        else
            Fixarr (i-1) s
    top (Fixarr i s) =
        if i <= 0 then
            error "stack empty"
        else
            s ! i
    isempty (Fixarr i s) = i <= 0

arrayStack xs = Stack (Fixarr (length xs)
                              (listArray (1, maxIndex) xs))

As we saw in Section 3.2.2, it is convenient to define wrapper functions that
apply the functions operating on instances of the type class Stack to an
encapsulated value of type STACK; these “outer” wrappers open the encapsulated
stack, apply the corresponding “inner” operations, and close the stack
again. This provides dynamic dispatching of operations across different
implementations of STACK. The wrapper wpush is defined as follows:

wpush  a  (Stack  s)  =  Stack (push  a  s) 

We can define the following list, which is a homogeneous list of two different
implementations of STACK:

stackList = [Stack ([1, 2, 3] :: [Int]),
             arrayStack ([5, 6, 7] :: [Int])]

Using  the  wrapper  wpush  and  the  built-in  function  map,  we  can  uniformly 
push  an  integer  onto  each  element  of  the  list: 

map  (wpush  8)  stackList 
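
The dispatching behaviour can be tried out with the following sketch in present-day GHC Haskell (an assumption; the chapter's language predates these extensions). The constructor is written `MkStack`, and the array-based implementation is replaced by a hypothetical size-bounded list `Counted` so that the example needs no array library; `wtop` and `countedStack` are likewise illustrative names.

```haskell
{-# LANGUAGE ExistentialQuantification, FlexibleInstances #-}

class Stack a where
  empty   :: a
  push    :: Int -> a -> a
  pop     :: a -> a
  top     :: a -> Int
  isempty :: a -> Bool

-- Encapsulated stacks: the implementation type a is hidden.
data STACK = forall a. Stack a => MkStack a

-- First implementation: a plain list of integers.
instance Stack [Int] where
  empty   = []
  push    = (:)
  pop     = tail
  top     = head
  isempty = null

maxIndex :: Int
maxIndex = 100

-- Second implementation (stand-in for the array version):
-- an element count together with the elements, bounded by maxIndex.
data Counted = Counted Int [Int]

instance Stack Counted where
  empty = Counted 0 []
  push a (Counted i s)
    | i >= maxIndex = error "stack size exceeded"
    | otherwise     = Counted (i + 1) (a : s)
  pop (Counted i s)
    | i <= 0    = error "stack empty"
    | otherwise = Counted (i - 1) (tail s)
  top (Counted i s)
    | i <= 0    = error "stack empty"
    | otherwise = head s
  isempty (Counted i _) = i <= 0

countedStack :: [Int] -> STACK
countedStack xs = MkStack (Counted (length xs) xs)

-- "Outer" wrappers: open the stack, apply the "inner" operation, close it again.
wpush :: Int -> STACK -> STACK
wpush a (MkStack s) = MkStack (push a s)

wtop :: STACK -> Int
wtop (MkStack s) = top s

stackList :: [STACK]
stackList = [MkStack ([1, 2, 3] :: [Int]), countedStack [5, 6, 7]]
```

With these definitions, `map wtop (map (wpush 8) stackList)` evaluates to `[8,8]`: the same wrapper selects a different `push` for each element.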

5.3  Syntax 

The formal treatment of our extension of Haskell builds on the article
[NS91] by Nipkow and Snelting, who were the first to give an accurate treatment
of type inference in Haskell. Our language is an extension of theirs
with algebraic data types.

5.3.1  Language  Syntax 


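\[
\begin{array}{llcl}
\text{Identifiers} & x \\
\text{Constructors} & K \\
\text{Type constructors} & t \\
\text{Expressions} & e & ::= & () \mid \mathbf{true} \mid \mathbf{false} \mid x \mid (e_1, e_2) \mid e\ e' \mid \lambda x.\,e \mid {} \\
& & & \mathbf{let}\ x = e\ \mathbf{in}\ e' \mid {} \\
& & & K \mid \mathbf{is}\ K \mid \mathbf{let}\ K\ x = e\ \mathbf{in}\ e' \\
\text{Declarations} & d & ::= & \mathbf{data}\ t = \forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\,\chi \mid {} \\
& & & \mathbf{class}\ \gamma \leq \gamma_1, \ldots, \gamma_n\ \mathbf{where}\ x_1 : \forall\alpha_\gamma.\tau_1, \ldots, x_k : \forall\alpha_\gamma.\tau_k \mid {} \\
& & & \mathbf{inst}\ t : (\gamma_1, \ldots, \gamma_n)\,\gamma\ \mathbf{where}\ x_1 = e_1, \ldots, x_k = e_k \\
\text{Programs} & p & ::= & d_1 \ldots d_n\ e
\end{array}
\]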

5.3.2  Type  Syntax 

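\[
\begin{array}{llcl}
\text{Type variables} & \alpha_\gamma \\
\text{Skolem functions} & \kappa \\
\text{Type constructors} & t \\
\text{Types} & \tau & ::= & \mathit{unit} \mid \mathit{bool} \mid \alpha_\gamma \mid \tau_1 \times \tau_2 \mid \tau \to \tau' \mid {} \\
& & & \kappa \mid t(\tau_1, \ldots, \tau_n) \mid \chi \\
\text{Recursive types} & \chi & ::= & \mu\beta.K_1\eta_1 + \ldots + K_m\eta_m \quad \text{where } K_i \neq K_j \text{ for } i \neq j \\
\text{Existential types} & \eta & ::= & \exists\alpha_\gamma.\eta \mid \tau \\
\text{Type schemes} & \sigma & ::= & \forall\alpha_\gamma.\sigma \mid \eta \to \tau \mid \tau \\
\text{Assumptions} & a & ::= & \sigma/x \mid \sigma/K
\end{array}
\]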


Our type syntax includes recursive types $\chi$ and Skolem type constructors $\kappa$;
the latter are used to type identifiers bound by a pattern-matching let
whose type is existentially quantified. Explicit existential types arise only
as domain types of value constructors. Further, let $\Sigma[K\eta]$ stand for sum
type contexts such as $K_1\eta_1 + \ldots + K_m\eta_m$, where $K_i = K$ and $\eta_i = \eta$ for
some $i$. Our type syntax also includes explicit type constructors $t$; this
makes it possible to extend the order-sorted signature with arities for
user-defined type constructors.

5.4  Type  Inference 

5.4.1  Instantiation  and  Generalization  of  Type  Schemes 

$\forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\tau \geq_C \tau'$ iff there are types $\tau_1, \ldots, \tau_n$ of sorts $\gamma_1, \ldots, \gamma_n$,
respectively, such that $\tau' = \tau[\tau_1/\alpha_{\gamma_1}, \ldots, \tau_n/\alpha_{\gamma_n}]$.

$\exists\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\tau \leq_C \tau'$ iff there are types $\tau_1, \ldots, \tau_n$ of sorts $\gamma_1, \ldots, \gamma_n$,
respectively, such that $\tau' = \tau[\tau_1/\alpha_{\gamma_1}, \ldots, \tau_n/\alpha_{\gamma_n}]$.

In addition to $FV$, the set of free type variables in a type scheme or assumption
set, we use $FS$, the set of those Skolem type constructors that occur in
a type scheme or assumption set, and $FT$, the set of defined type constructors
in a type scheme.
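
As a small worked instance of the first relation, assume purely for illustration that the signature $C$ declares $\mathit{bool}$ to be of sort $Key$ (as the instance declarations of Section 5.2.1 would) and that $\mathit{int}$ is a declared type constructor; neither assumption is part of the formal development. Choosing $\tau_1 = \mathit{bool}$ then gives

\[
\forall\alpha_{Key}.\,\alpha \to \mathit{int} \;\geq_C\; (\alpha \to \mathit{int})[\mathit{bool}/\alpha_{Key}] \;=\; \mathit{bool} \to \mathit{int} .
\]

By contrast, $\mathit{unit} \to \mathit{int}$ would not be an instance under this $C$, since $\mathit{unit}$ has not been declared to be of sort $Key$.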

5.4.2  Inference  Rules  for  Expressions 

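The first five typing rules are the same as in the system described in [NS91].

\[
\text{(VAR$^+$)} \quad \frac{A(x) \geq_C \tau}{(A, C) \vdash^{+} x : \tau}
\]

\[
\text{(PAIR$^+$)} \quad \frac{(A, C) \vdash^{+} e_1 : \tau_1 \qquad (A, C) \vdash^{+} e_2 : \tau_2}{(A, C) \vdash^{+} (e_1, e_2) : \tau_1 \times \tau_2}
\]

\[
\text{(APPL$^+$)} \quad \frac{(A, C) \vdash^{+} e : \tau' \to \tau \qquad (A, C) \vdash^{+} e' : \tau'}{(A, C) \vdash^{+} e\ e' : \tau}
\]

\[
\text{(ABS$^+$)} \quad \frac{(A[\tau'/x], C) \vdash^{+} e : \tau}{(A, C) \vdash^{+} \lambda x.\,e : \tau' \to \tau}
\]

\[
\text{(LET$^+$)} \quad \frac{\begin{array}{c}
FV(\tau) \setminus FV(A) = \{\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n}\} \\
(A, C) \vdash^{+} e : \tau \qquad (A[\forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\tau/x], C) \vdash^{+} e' : \tau'
\end{array}}{(A, C) \vdash^{+} \mathbf{let}\ x = e\ \mathbf{in}\ e' : \tau'}
\]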


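The new rules CONS+, TEST+, and PAT+ are used to type value constructors,
is expressions, and pattern-matching let expressions, respectively.

\[
\text{(CONS$^+$)} \quad \frac{A(K) \geq_C \eta \to t(\tau_1, \ldots, \tau_n) \qquad \eta \leq_C \tau}{(A, C) \vdash^{+} K : \tau \to t(\tau_1, \ldots, \tau_n)}
\]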

The  CONS+  rule  observes  the  fact  that  existential  quantification  in  argument 
position  means  universal  quantification  over  the  whole  function  type;  this  is 
expressed  by  the  second  premise. 
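
This reading of CONS+ can be observed in present-day GHC Haskell, used here as an assumed stand-in for the extended language. In the sketch below the constructor `MkKey`, whose argument type is existentially quantified, is given the universally quantified function type `Key a => a -> KEY`; the names `mkKey`, `keys` and `keyValues` are illustrative.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

class Key a where
  whatkey :: a -> Int

instance Key Int where whatkey = id
instance Key Bool where whatkey b = if b then 1 else 0

data KEY = forall a. Key a => MkKey a

-- The constructor as a function: universally quantified over a.
mkKey :: Key a => a -> KEY
mkKey = MkKey

-- The same function used at two different instance types.
keys :: [KEY]
keys = [mkKey (3 :: Int), mkKey True]

keyOf :: KEY -> Int
keyOf (MkKey v) = whatkey v

keyValues :: [Int]
keyValues = map keyOf keys
```

Here `keyValues` evaluates to `[3,1]`.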


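\[
\text{(TEST$^+$)} \quad \frac{A(K) \geq_C \eta \to t(\tau_1, \ldots, \tau_n)}{(A, C) \vdash^{+} \mathbf{is}\ K : t(\tau_1, \ldots, \tau_n) \to \mathit{bool}}
\]

The TEST+ rule ensures that is K is applied only to arguments whose type
is the same as the result type of constructor K.

\[
\text{(PAT$^+$)} \quad \frac{\begin{array}{c}
A(K) \geq_C (\exists\beta_{\delta_1} \ldots \beta_{\delta_k}.\tau) \to t(\tau_1, \ldots, \tau_n) \qquad (A, C) \vdash^{+} e : t(\tau_1, \ldots, \tau_n) \\
\{\kappa_1, \ldots, \kappa_k\} \cap (FS(\tau') \cup FS(A)) = \emptyset \\
(A[\tau[\kappa_i/\beta_{\delta_i} \mid i = 1 \ldots k]/x],\ C[\kappa_i : \delta_i \mid i = 1 \ldots k]) \vdash^{+} e' : \tau'
\end{array}}{(A, C) \vdash^{+} \mathbf{let}\ K\ x = e\ \mathbf{in}\ e' : \tau'}
\]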


The last rule, PAT+, governs the typing of pattern-matching let expressions.
It requires that the expression $e$ be of the same type as the result type
of the constructor $K$. The body $e'$ is typed under the assumption set extended
with an assumption about the bound identifier $x$. The new Skolem type constructors
must not appear in $A$; this ensures that they do not appear in the
type of any identifier free in $e'$ other than $x$. It is also guaranteed that the
Skolem type constructors do not appear in the result type $\tau'$. The Skolem
type constructors $\kappa_1, \ldots, \kappa_k$ replace the existentially quantified type variables
of sorts $\delta_1, \ldots, \delta_k$. Thus the body of the let expression is typed under
the extended signature containing appropriate arities for $\kappa_1, \ldots, \kappa_k$. The
pattern-matching let expression is monomorphic in the sense that the type of
the bound variable $x$ is not generalized. This restriction is sufficient to guarantee
a type-preserving translation into a target language (see Section
5.6.5). The case expression in Haskell syntax corresponds to a nested if
with an is and a pattern-matching let expression for each case.
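
The side conditions of PAT+ have a direct counterpart in present-day GHC Haskell, which we use below as an assumed stand-in for the extended language. The sketch shows two uses of an opened value that are accepted, because the hidden type does not occur in the result type, and one (left as a comment) that GHC rejects because the hidden type would escape. All function names are illustrative.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

class Key a where
  whatkey :: a -> Int

instance Key Int where whatkey = id
instance Key [a] where whatkey = length

data KEY = forall a. Key a => MkKey a

-- Accepted: the result type Int does not mention the hidden type.
openKey :: KEY -> Int
openKey k = case k of
  MkKey v -> whatkey v + 1

-- Accepted: the opened value is closed again before it is returned.
rewrap :: KEY -> KEY
rewrap (MkKey v) = MkKey v

-- Rejected if uncommented: the hidden type of v would appear in the result,
-- which corresponds to a Skolem type constructor occurring in the result type.
-- leak (MkKey v) = v
```

For example, `openKey (MkKey (41 :: Int))` evaluates to 42, and `openKey (rewrap (MkKey "abc"))` to 4.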

5.4.3  Inference  Rules  for  Declarations  and  Programs 

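The rules for class and instance declarations, and programs are the same as
in [NS91]. We add the DATA+ rule to elaborate a recursive datatype declaration.

\[
\text{(CLASS$^+$)} \quad \frac{FT(\tau_1) \cup \ldots \cup FT(\tau_k) \subseteq Dom(C)}{\begin{array}{l}
(A, C) \vdash^{+} \mathbf{class}\ \gamma \leq \gamma_1, \ldots, \gamma_n\ \mathbf{where}\ x_1 : \forall\alpha_\gamma.\tau_1, \ldots, x_k : \forall\alpha_\gamma.\tau_k : {} \\
\qquad (A[\forall\alpha_\gamma.\tau_i/x_i \mid i = 1 \ldots k],\ C[\gamma \leq \gamma_1, \ldots, \gamma_n])
\end{array}}
\]

\[
\text{(INST$^+$)} \quad \frac{\begin{array}{c}
t \in Dom(C) \qquad A(x_i) = \forall\alpha_\gamma.\tau_i \\
(A, C) \vdash^{+} e_i : \tau_i[t(\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n})/\alpha_\gamma] \qquad i = 1 \ldots k
\end{array}}{\begin{array}{l}
(A, C) \vdash^{+} \mathbf{inst}\ t : (\gamma_1, \ldots, \gamma_n)\,\gamma\ \mathbf{where}\ x_1 = e_1, \ldots, x_k = e_k : {} \\
\qquad (A,\ C[t : (\gamma_1, \ldots, \gamma_n)\,\gamma])
\end{array}}
\]

\[
\text{(PROG$^+$)} \quad \frac{(A_{i-1}, C_{i-1}) \vdash^{+} d_i : (A_i, C_i) \quad i = 1 \ldots n \qquad (A_n, C_n) \vdash^{+} e : \tau}{(A_0, C_0) \vdash^{+} d_1 \ldots d_n\ e : \tau}
\]

\[
\text{(DATA$^+$)} \quad \frac{\begin{array}{c}
\sigma = \forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.(\mu\beta.K_1\eta_1 + \ldots + K_m\eta_m) \\
FT(\sigma) \subseteq Dom(C) \qquad t \notin Dom(C)
\end{array}}{\begin{array}{l}
(A, C) \vdash^{+} \mathbf{data}\ t = \sigma : {} \\
\qquad (A[\forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.(\eta_i[t(\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n})/\beta]) \to t(\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n})/K_i \mid i = 1 \ldots m], \\
\qquad\ C[t : (\gamma_1, \ldots, \gamma_n)\,\Omega])
\end{array}}
\]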

The DATA+ rule adds assumptions about the value constructors to the assumption
set, and extends the signature with an appropriate arity for the new
type constructor. Whereas recursive datatypes were anonymous in the two
preceding chapters, they are now represented by named type constructors.
This is necessary since the order-sorted signature $C$ may contain arity declarations
for user-defined type constructors. We avoid using a separate type
constructor environment; therefore, in an assumption $\sigma/K$ about a value
constructor, $\sigma$ now is the type scheme for $K$ when regarded as a function, as
opposed to the type scheme describing the entire recursive datatype $K$ belongs
to.

5.4.4  Relation  to  the  Haskell  Type  Inference  System 

Theorem 5.1 [Conservative extension] Let Mini-Haskell’ be an extension
of Mini-Haskell with recursive datatypes and a monomorphic pattern-matching
let expression, but without existential quantification. Then, for
any Mini-Haskell’ program $p$, $(A, C) \vdash^{+} p : \tau$ iff $(A, C) \vdash_{MH'} p : \tau$.

Proof: By structural induction on $p$.

Corollary 5.2 [Conservative extension] Our type system is a conservative
extension of the Mini-Haskell type system described in [NS91], in the following
sense: For any Mini-Haskell program $p$, $(A, C) \vdash^{+} p : \tau$ iff
$(A, C) \vdash_{MH} p : \tau$.

Proof: Follows immediately from Theorem 5.1.

5.5  Type  Reconstruction 

The type reconstruction algorithm is a translation from the deterministic
typing rules, using order-sorted unification [SS85][MGS89] instead of standard
unification.

5.5.1  Unitary  Signatures  for  Principal  Types 

The article [NS91] describes several conditions necessary to guarantee unitary
signatures, which are sufficient to guarantee principal types. First, to
make a signature $C$ regular and downward complete, we perform the following
two steps to obtain a new signature $C^R$:

•  For any two incomparable classes $\gamma_1, \gamma_2 \in Dom(C)$, we introduce a new
class declaration class $\gamma \leq \gamma_1, \gamma_2$ with an empty where part combining
the operations of $\gamma_1$ and $\gamma_2$.

•  Then, for each type constructor with instance declarations

inst $t : (\gamma_{11}, \ldots, \gamma_{1n})\,\gamma_1$ where ...
inst $t : (\gamma_{21}, \ldots, \gamma_{2n})\,\gamma_2$ where ...

introduce another instance declaration of the form

inst $t : (\gamma_{11} \wedge \gamma_{21}, \ldots, \gamma_{1n} \wedge \gamma_{2n})\,(\gamma_1 \wedge \gamma_2)$

where $\gamma \wedge \delta$ is simply the additionally declared class if $\gamma$ and $\delta$ are
incomparable, or otherwise the lower one in the class hierarchy.

Note that Haskell uses multiple class assertions for type variables to express
this conjunction of classes.
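
As a brief illustration of such multiple class assertions in standard Haskell, the hypothetical function below constrains its type variable by the two incomparable classes Ord and Show at once, which plays the role of the conjunction $Ord \wedge Show$.

```haskell
import Data.List (sort)

-- The context (Ord a, Show a) asserts both classes for the same type variable.
describeSmallest :: (Ord a, Show a) => [a] -> String
describeSmallest xs = case sort xs of
  []      -> "empty"
  (x : _) -> "smallest: " ++ show x
```

For example, `describeSmallest [3, 1, 2 :: Int]` evaluates to `"smallest: 1"`.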

Since regular signatures alone do not guarantee the existence of principal
types, we impose the following two conditions on $C^R$, which are also
present in Haskell:

•  Injectivity: A type constructor may not be declared as an instance of a
particular class more than once in the same scope.

•  Subsort reflection: If $\gamma_1, \ldots, \gamma_n$ are the immediate superclasses of $\delta$, a
declaration inst $t : (\delta_1, \ldots, \delta_m)\,\delta$ where ... must be preceded by declarations
inst $t : (\gamma^j_1, \ldots, \gamma^j_m)\,\gamma_j$ where ... such that $\delta_i$ is a subclass of $\gamma^j_i$ for
all $i = 1 \ldots m$ and $j = 1 \ldots n$.

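As discussed in [NS91], a Haskell signature that satisfies these conditions is
unitary.

5.5.2  Auxiliary Functions

In our algorithm, we need to instantiate universally quantified types and
generalize existentially quantified types. Both are handled in the same way.

\[
\begin{array}{ll}
inst_\forall(\forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\tau) = \tau[\beta_{\gamma_1}/\alpha_{\gamma_1}, \ldots, \beta_{\gamma_n}/\alpha_{\gamma_n}] & \text{where } \beta_{\gamma_1}, \ldots, \beta_{\gamma_n} \text{ are fresh type variables} \\[1ex]
inst_\exists(\exists\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\tau) = \tau[\beta_{\gamma_1}/\alpha_{\gamma_1}, \ldots, \beta_{\gamma_n}/\alpha_{\gamma_n}] & \text{where } \beta_{\gamma_1}, \ldots, \beta_{\gamma_n} \text{ are fresh type variables} \\[1ex]
osu_C(\tau, \tau') & \text{the most general unifier of } \tau \text{ and } \tau' \text{ under order-sorted signature } C
\end{array}
\]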


5.5.3  Algorithm 


Our type reconstruction function takes an assumption set, an order-sorted
signature, and an expression, and it returns a substitution and a type expression.
There is one case for each typing rule.
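
To make the shape of such a function concrete, here is a deliberately reduced sketch in Haskell. It is not the algorithm of this section: it uses ordinary first-order unification in place of order-sorted unification, has no signature argument, and covers only variables, boolean constants, pairs, application and abstraction (no let, constructors, or Skolem types). Fresh type variables are generated by threading an integer counter. All names (`te`, `unify`, `infer`, and so on) are illustrative.

```haskell
data Type = TVar Int | TBool | TFun Type Type | TPair Type Type
  deriving (Eq, Show)

data Expr = Var String | BoolLit Bool | Pair Expr Expr
          | App Expr Expr | Lam String Expr

type Subst = [(Int, Type)]
type Env   = [(String, Type)]

apply :: Subst -> Type -> Type
apply s t@(TVar n) = maybe t id (lookup n s)
apply s (TFun a b)  = TFun (apply s a) (apply s b)
apply s (TPair a b) = TPair (apply s a) (apply s b)
apply _ TBool       = TBool

applyEnv :: Subst -> Env -> Env
applyEnv s env = [(x, apply s t) | (x, t) <- env]

-- compose s2 s1 behaves like applying s1 first and then s2.
compose :: Subst -> Subst -> Subst
compose s2 s1 = [(n, apply s2 t) | (n, t) <- s1] ++ s2

occurs :: Int -> Type -> Bool
occurs n (TVar m)    = n == m
occurs n (TFun a b)  = occurs n a || occurs n b
occurs n (TPair a b) = occurs n a || occurs n b
occurs _ TBool       = False

bind :: Int -> Type -> Maybe Subst
bind n t
  | t == TVar n = Just []
  | occurs n t  = Nothing          -- occurs check
  | otherwise   = Just [(n, t)]

-- Ordinary most general unifier (stands in for order-sorted unification).
unify :: Type -> Type -> Maybe Subst
unify (TVar n) t = bind n t
unify t (TVar n) = bind n t
unify TBool TBool = Just []
unify (TFun a b) (TFun c d)   = unify2 a b c d
unify (TPair a b) (TPair c d) = unify2 a b c d
unify _ _ = Nothing

unify2 :: Type -> Type -> Type -> Type -> Maybe Subst
unify2 a b c d = do
  s1 <- unify a c
  s2 <- unify (apply s1 b) (apply s1 d)
  Just (compose s2 s1)

-- One case per typing rule; the Int is the next unused type variable.
te :: Env -> Expr -> Int -> Maybe (Subst, Type, Int)
te env (Var x) n = do
  t <- lookup x env
  Just ([], t, n)
te _ (BoolLit _) n = Just ([], TBool, n)
te env (Pair e1 e2) n = do
  (s1, t1, n1) <- te env e1 n
  (s2, t2, n2) <- te (applyEnv s1 env) e2 n1
  Just (compose s2 s1, TPair (apply s2 t1) t2, n2)
te env (App e e') n = do
  (s, t, n1)   <- te env e n
  (s', t', n2) <- te (applyEnv s env) e' n1
  let b = TVar n2
  u <- unify (apply s' t) (TFun t' b)
  Just (compose u (compose s' s), apply u b, n2 + 1)
te env (Lam x e) n = do
  let b = TVar n
  (s, t, n1) <- te ((x, b) : env) e (n + 1)
  Just (s, TFun (apply s b) t, n1)

infer :: Expr -> Maybe Type
infer e = do
  (_, t, _) <- te [] e 0
  Just t
```

For instance, `infer (App (Lam "x" (Var "x")) (BoolLit True))` yields `Just TBool`, while the self-application `infer (Lam "x" (App (Var "x") (Var "x")))` yields `Nothing` because the occurs check fails.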


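\[
\begin{array}{l}
TE(A, C, x) = \\
\quad (Id,\ inst_\forall(A(x))) \\[1ex]
TE(A, C, (e_1, e_2)) = \\
\quad \mathbf{let}\ (S_1, \tau_1) = TE(A, C, e_1) \\
\quad \phantom{\mathbf{let}}\ (S_2, \tau_2) = TE(S_1 A, C, e_2) \\
\quad \mathbf{in}\ (S_2 S_1,\ S_2\tau_1 \times \tau_2) \\[1ex]
TE(A, C, e\ e') = \\
\quad \mathbf{let}\ (S, \tau) = TE(A, C, e) \\
\quad \phantom{\mathbf{let}}\ (S', \tau') = TE(SA, C, e') \\
\quad \phantom{\mathbf{let}}\ \beta \text{ be a fresh type variable} \\
\quad \phantom{\mathbf{let}}\ U = osu_C(S'\tau,\ \tau' \to \beta) \\
\quad \mathbf{in}\ (US'S,\ U\beta) \\[1ex]
TE(A, C, \lambda x.\,e) = \\
\quad \mathbf{let}\ \beta \text{ be a fresh type variable} \\
\quad \phantom{\mathbf{let}}\ (S, \tau) = TE(A[\beta/x], C, e) \\
\quad \mathbf{in}\ (S,\ S\beta \to \tau) \\[1ex]
TE(A, C, \mathbf{let}\ x = e\ \mathbf{in}\ e') = \\
\quad \mathbf{let}\ (S, \tau) = TE(A, C, e) \\
\quad \phantom{\mathbf{let}}\ (S', \tau') = TE(SA[gen(SA, \tau)/x], C, e') \\
\quad \mathbf{in}\ (S'S,\ \tau') \\[1ex]
TE(A, C, K) = \\
\quad \mathbf{let}\ \eta \to \tau = inst_\forall(A(K)) \\
\quad \mathbf{in}\ (Id,\ inst_\exists(\eta) \to \tau) \\[1ex]
TE(A, C, \mathbf{is}\ K) = \\
\quad \mathbf{let}\ \eta \to \tau = inst_\forall(A(K)) \\
\quad \mathbf{in}\ (Id,\ \tau \to \mathit{bool}) \\[1ex]
TE(A, C, \mathbf{let}\ K\ x = e\ \mathbf{in}\ e') = \\
\quad \mathbf{let}\ (S, \tau) = TE(A, C, e) \\
\quad \phantom{\mathbf{let}}\ (\exists\beta_{\delta_1} \ldots \beta_{\delta_k}.\tau_0) \to t(\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n}) = inst_\forall(A(K)) \\
\quad \phantom{\mathbf{let}}\ U = osu_C(\tau,\ t(\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n})) \\
\quad \phantom{\mathbf{let}}\ \kappa_1, \ldots, \kappa_k \text{ fresh type constructors} \\
\quad \phantom{\mathbf{let}}\ \tau_\kappa = (U\tau_0)[\kappa_i/\beta_{\delta_i} \mid i = 1 \ldots k] \\
\quad \phantom{\mathbf{let}}\ C' = C[\kappa_i : \delta_i \mid i = 1 \ldots k] \\
\quad \phantom{\mathbf{let}}\ (S', \tau') = TE(USA[\tau_\kappa/x], C', e') \\
\quad \mathbf{in}\ \mathbf{if}\ \{\kappa_1, \ldots, \kappa_k\} \cap (FS(S'USA) \cup FS(\tau')) = \emptyset\ \mathbf{then} \\
\quad \qquad (S'US,\ \tau')
\end{array}
\]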

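\[
\begin{array}{l}
TD(A, C, \mathbf{data}\ t = \sigma) = \\
\quad \mathbf{let}\ \forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\mu\beta.K_1\eta_1 + \ldots + K_m\eta_m = \sigma\ \mathbf{in} \\
\quad \mathbf{if}\ FV(\sigma) = \emptyset \wedge t \notin Dom(C) \wedge FT(\sigma) \subseteq Dom(C)\ \mathbf{then} \\
\quad \qquad (A[\forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\eta_i[t(\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n})/\beta] \to t(\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n})/K_i \mid i = 1 \ldots m], \\
\quad \qquad\ C[t : (\gamma_1, \ldots, \gamma_n)\,\Omega]) \\[1ex]
TD(A, C, \mathbf{class}\ \gamma \leq \gamma_1, \ldots, \gamma_n\ \mathbf{where}\ x_1 : \forall\alpha_\gamma.\tau_1, \ldots, x_k : \forall\alpha_\gamma.\tau_k) = \\
\quad (A[\forall\alpha_\gamma.\tau_i/x_i \mid i = 1 \ldots k],\ C[\gamma \leq \gamma_1, \ldots, \gamma_n]) \\[1ex]
TD(A, C, \mathbf{inst}\ t : (\gamma_1, \ldots, \gamma_n)\,\gamma\ \mathbf{where}\ x_1 = e_1, \ldots, x_k = e_k) = \\
\quad (A,\ C[t : (\gamma_1, \ldots, \gamma_n)\,\gamma]) \\[1ex]
TD(A, C, d_1 \ldots d_n) = \\
\quad \mathbf{let}\ (A', C') = TD(A, C, d_1)\ \mathbf{in} \\
\quad TD(A', C', d_2 \ldots d_n) \\[1ex]
TP(A, C, d_1 \ldots d_n\ e) = \\
\quad \mathbf{let}\ (A', C') = TD(A, C, d_1 \ldots d_n)\ \mathbf{in} \\
\quad TE(A', C', e)
\end{array}
\]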

5.5.4  Syntactic  Soundness  and  Completeness  of  Type 
Reconstruction 

Lemma 5.3 [Stability of $\vdash^{+}$] If $(A, C) \vdash^{+} e : \tau$ and $S$ is a substitution,
then $(SA, C) \vdash^{+} e : S\tau$ also holds. Moreover, if there is a proof tree for
$(A, C) \vdash^{+} e : \tau$ of height $n$, then there is also a proof tree for
$(SA, C) \vdash^{+} e : S\tau$ of height less than or equal to $n$.

Theorem 5.4 [Syntactic soundness] If $TE(A, C, e) = (S, \tau)$, then
$(SA, C) \vdash^{+} e : \tau$.

Definition 5.1 [Principal type] $\tau$ is a principal type of expression $e$ under
assumption set $A$ and signature $C$ if $(A, C) \vdash^{+} e : \tau$ and whenever
$(A, C) \vdash^{+} e : \tau'$ then there is a substitution $S$ such that $S\tau = \tau'$.

Theorem 5.5 [Syntactic completeness] If $(S'A, C) \vdash^{+} e : \tau'$, then
$TE(A, C, e) = (S, \tau)$ and there is a substitution $R$ such that $S'A = RSA$ and
$\tau' = R\tau$.

Proof: We build on Nipkow’s recent work on type classes and order-sorted
unification and extend it with existential types.

We assume that the signature built from the global class and inst
declarations is unitary. Clearly, the extended signature used to type the
body of a pattern-matching let expression is also unitary, since the
Skolem type constructors $\kappa_i$ are unique, and each $\kappa_i$ appears in only
one arity declaration. The latter trivially guarantees injectivity and
subsort reflection.


5.6  Semantics 

As in [NS91] [WB89], we give an inference-guided translation to the target
language, an enhanced version of our extension of ML with existential types
described in Chapter 3. Type classes and instances are replaced by (method)
dictionaries, which contain all the operations associated with a particular
instance of a type class. The translation rules are of the form
$(A, C) \vdash^{+} e : \tau \Rightarrow \tilde{e}$ and mean “in the context $(A, C)$, $e$ is assigned type $\tau$
and translates to $\tilde{e}$.”

5.6.1  Target  Language 

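Our extension of Mini-Haskell is translated into an extended version of the
language presented in Chapter 3. As a generalization of pair types, the language
contains all n-ary product types $\alpha_1 \times \ldots \times \alpha_n$ with expressions
$(e_1, \ldots, e_n)$ and projection functions $\pi_i^n$ of type $\alpha_1 \times \ldots \times \alpha_n \to \alpha_i$. The
PAIR rule is superseded by the TUPLE rule:

\[
\text{(TUPLE)} \quad \frac{A \vdash e_1 : \tau_1 \quad \ldots \quad A \vdash e_n : \tau_n}{A \vdash (e_1, \ldots, e_n) : \tau_1 \times \ldots \times \tau_n}
\]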

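Semantically, expressions of the form

\[
\mathbf{let}\ K(x_1, \ldots, x_n) = e\ \mathbf{in}\ e'
\]

are regarded as short forms for nested let expressions of the form

\[
\begin{array}{l}
\mathbf{let}\ K\ z = e\ \mathbf{in} \\
\quad \mathbf{let}\ x_1 = \pi_1^n z, \ldots, x_n = \pi_n^n z\ \mathbf{in}\ e'
\end{array}
\]

and are typed by the following PAT” rule:

\[
\text{(PAT”)} \quad \frac{\begin{array}{c}
A \vdash e : \mu\beta.\Sigma[K\eta] \qquad \eta = \exists\bar{\alpha}.\,\tau_1 \times \ldots \times \tau_n \\
\tau'_1 \times \ldots \times \tau'_n = skolem(A, \eta) \qquad FS(\tau') \subseteq FS(A) \\
\sigma_1 = gen(A, \tau'_1) \quad \ldots \quad \sigma_n = gen(A, \tau'_n) \\
A[\sigma_1/x_1, \ldots, \sigma_n/x_n] \vdash e' : \tau'
\end{array}}{A \vdash \mathbf{let}\ K(x_1, \ldots, x_n) = e\ \mathbf{in}\ e' : \tau'}
\]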


This  rule  is  semantically  sound,  since  the  translation  of  the  short  form  to  the 
full  form  is  type-preserving:  an  application  of  the  PAT”  rule  is  replaced  by 
an  application  of  the  PAT  rule  followed  by  n  successive  applications  of  the 
LET  rule,  using  appropriate  typings  for  the  tuple  projections. 

5.6.2  Dictionaries  and  Translation  of  Types 

We call the translated types “ML-types” to distinguish them from the original
ones. ML-types introduce a method dictionary for each sorted type variable
in the original type; each sorted type variable is then replaced by an ordinary
type variable.

A class declaration

\[
(A, C) \vdash^{+} \mathbf{class}\ \gamma \leq \gamma_1, \ldots, \gamma_n\ \mathbf{where}\ x_1 : \forall\alpha_\gamma.\tau_1, \ldots, x_k : \forall\alpha_\gamma.\tau_k
\]

introduces a new ML-type for method dictionaries of this class,

\[
\gamma(\alpha) = \tau_1[\alpha/\alpha_\gamma] \times \ldots \times \tau_k[\alpha/\alpha_\gamma] \times \gamma_1(\alpha) \times \ldots \times \gamma_n(\alpha)
\]

where the type parameter $\alpha$ stands for the type of the instance. The first $k$
components of $\gamma(\alpha)$ are the operations corresponding to class $\gamma$. The next $n$
components are the dictionaries for all immediate superclasses $\gamma_1, \ldots, \gamma_n$ of
$\gamma$. Note that the dictionary $\Omega(\alpha)$ for the top of the class hierarchy is the empty
product type.

Instead of defining $\gamma(\alpha)$ directly, the dictionary type is defined in terms
of access functions $x_1, \ldots, x_k$ to extract operations from a dictionary, and access
functions $\gamma_1^{\gamma}, \ldots, \gamma_n^{\gamma}$ to extract the dictionaries for the immediate
superclasses from a dictionary.

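Coercion functions are needed to convert a dictionary $\alpha_\gamma$ of a class $\gamma$ into
a dictionary of a superclass of $\gamma$; they are defined the same way as in [NS91]:

\[
cast_C(\alpha_\gamma, \gamma') =
\begin{cases}
\alpha_\gamma & \text{if } \gamma = \gamma' \\
\gamma'^{\delta}(cast_C(\alpha_\gamma, \delta)) & \text{if } \gamma \leq \delta \wedge \gamma' \in super_C(\delta)
\end{cases}
\]

If there is more than one path from $\gamma$ to $\gamma'$ with respect to $\leq_C$, $cast_C$ chooses
an arbitrary fixed path. The immediate superclasses of a class $\gamma$ are defined
as:

\[
super_C(\gamma) = \{\gamma' \mid \gamma < \gamma' \wedge \neg\exists\delta.\ \gamma < \delta < \gamma'\}
\]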

The method dictionary for an instance $\tau$ of a class $\gamma$ within the signature $C$
is defined as

\[
\begin{array}{ll}
dict_C(\alpha_{\gamma'}, \gamma) = cast_C(\alpha_{\gamma'}, \gamma) \\[1ex]
dict_C(t(\tau_1, \ldots, \tau_n), \gamma) = \gamma_t\ (dict_C(\tau_1, \gamma_1)) \ldots (dict_C(\tau_n, \gamma_n)) & \text{if according to } C,\ t(\tau_1, \ldots, \tau_n) \text{ is} \\
& \text{sort-correct with resulting sort } \gamma
\end{array}
\]

Note that on the left hand side, $\alpha_{\gamma'}$ is a type variable, and on the right hand
side, an identifier that stands for a dictionary of class $\gamma'$. As we will see below,
the function $\gamma_t$ is the dictionary defined for the type constructor $t$ in the
translation of the corresponding instance declaration. Its arguments are the
dictionaries for the actual type parameters in an application of $t$.
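
The following Haskell sketch illustrates these notions with explicit dictionary values. It is our own illustration under assumed names, not the target language of this chapter: `EqD` and `OrdD` are dictionary types for two hypothetical classes, where the second has the first as immediate superclass; `castOrdToEq` plays the role of a cast along one superclass edge; and `eqList` plays the role of an instance dictionary for the list type constructor, taking the dictionary of the element type as its argument.

```haskell
-- Dictionary type for a hypothetical class with one operation eq.
newtype EqD a = EqD { eq :: a -> a -> Bool }

-- Dictionary type for a hypothetical subclass: its own operation first,
-- followed by the dictionary of its immediate superclass.
data OrdD a = OrdD { lt :: a -> a -> Bool, superEq :: EqD a }

-- Coercion from a subclass dictionary to the superclass dictionary.
castOrdToEq :: OrdD a -> EqD a
castOrdToEq = superEq

-- Instance dictionaries for Int.
eqInt :: EqD Int
eqInt = EqD (==)

ordInt :: OrdD Int
ordInt = OrdD (<) eqInt

-- Instance dictionary for the list type constructor: a function from the
-- dictionary for the element type to the dictionary for lists.
eqList :: EqD a -> EqD [a]
eqList d = EqD go
  where
    go (x : xs) (y : ys) = eq d x y && go xs ys
    go []       []       = True
    go _        _        = False

-- The dictionary for lists of lists of Int is built by nested application.
eqListListInt :: EqD [[Int]]
eqListListInt = eqList (eqList eqInt)

-- A function translated to take its dictionary as an explicit argument.
member :: EqD a -> a -> [a] -> Bool
member d x = any (eq d x)
```

For example, `member (castOrdToEq ordInt) 3 [1, 2, 3]` evaluates to `True`.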

We define the translation function for type schemes as follows:

\[
ML(\forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\tau) = \forall\alpha_1 \ldots \alpha_n.\ \gamma_1(\alpha_1) \to \ldots \to \gamma_n(\alpha_n) \to \tau[\alpha_1/\alpha_{\gamma_1}, \ldots, \alpha_n/\alpha_{\gamma_n}]
\]

where each $\alpha_{\gamma_i}$ uniquely maps to an $\alpha_i$. For existential component types of
user-defined datatypes, the corresponding ML-types need to include the dictionaries
for the existentially quantified type variables. This reflects the way
operations on existential types are explicitly included in the datatype components
in Chapter 3.

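\[
ML(\exists\beta_{\delta_1} \ldots \beta_{\delta_k}.\tau) = \exists\beta_1 \ldots \beta_k.\ \tau[\beta_1/\beta_{\delta_1}, \ldots, \beta_k/\beta_{\delta_k}] \times \delta_1(\beta_1) \times \ldots \times \delta_k(\beta_k)
\]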
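
As a small worked instance, consider the class $Key$ of Section 5.2.1 and assume, for illustration only, a type constructor $\mathit{int}$ in the signature. A function that is generic over instances of $Key$ receives an extra dictionary argument, while the existential component type of the constructor of KEY is paired with its dictionary:

\[
\begin{array}{l}
ML(\forall\alpha_{Key}.\,\alpha \to \mathit{int}) = \forall\alpha.\ Key(\alpha) \to \alpha \to \mathit{int} \\[1ex]
ML(\exists\beta_{Key}.\,\beta) = \exists\beta.\ \beta \times Key(\beta)
\end{array}
\]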

The resulting translation function for user-defined recursive type schemes is

\[
ML(\forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\mu\beta.\Sigma[K\eta]) = \forall\alpha_1 \ldots \alpha_n.\mu\beta.\Sigma[K\ ML(\eta)]\,[\alpha_1/\alpha_{\gamma_1}, \ldots, \alpha_n/\alpha_{\gamma_n}]
\]

Note that the dictionaries for the universally quantified type variables are
not included in the component types, as they are determined by the actual
instance types substituted for the type variables. Since user-defined
datatypes are anonymous in the target language, the translation function for
the type of a value constructor $K$ is given by the entire recursive type
scheme to which $K$ belongs:

\[
ML(A(K)) = ML(\forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\mu\beta.\Sigma[K\eta]) \quad \text{where } A(K) = \forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\eta \to t(\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n})
\]

The function $ML$ extends to assumption sets as follows:

\[
ML(A) = \{ML(A(x))/x \mid x \in Dom\,A\}
\]

5.6.3  Translation  Rules  for  Declarations  and  Programs 

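The first three translation rules concern class declarations, instance declarations,
and programs. They are the same as in [NS91]. Declarations translate
to let expressions without bodies.

\[
\text{(CLASS$^+$)} \quad \frac{FT(\tau_1) \cup \ldots \cup FT(\tau_k) \subseteq Dom(C)}{\begin{array}{l}
(A, C) \vdash^{+} \mathbf{class}\ \gamma \leq \gamma_1, \ldots, \gamma_n\ \mathbf{where}\ x_1 : \forall\alpha_\gamma.\tau_1, \ldots, x_k : \forall\alpha_\gamma.\tau_k : {} \\
\qquad (A[\forall\alpha_\gamma.\tau_i/x_i \mid i = 1 \ldots k],\ C[\gamma \leq \gamma_1, \ldots, \gamma_n]) \Rightarrow {} \\
\qquad \mathbf{let}\ x_1 = \pi_1^{k+n}, \ldots, x_k = \pi_k^{k+n},\ \gamma_1^{\gamma} = \pi_{k+1}^{k+n}, \ldots, \gamma_n^{\gamma} = \pi_{k+n}^{k+n}
\end{array}}
\]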


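\[
\text{(INST$^+$)} \quad \frac{\begin{array}{c}
super_C(\gamma) = \{\gamma^1, \ldots\} \qquad t \in Dom(C) \qquad A(x_i) = \forall\alpha_\gamma.\tau_i \\
(A, C) \vdash^{+} e_i : \tau_i[t(\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n})/\alpha_\gamma] \Rightarrow \tilde{e}_i \qquad i = 1 \ldots k
\end{array}}{\begin{array}{l}
(A, C) \vdash^{+} \mathbf{inst}\ t : (\gamma_1, \ldots, \gamma_n)\,\gamma\ \mathbf{where}\ x_1 = e_1, \ldots, x_k = e_k : {} \\
\qquad (A,\ C[t : (\gamma_1, \ldots, \gamma_n)\,\gamma]) \Rightarrow {} \\
\qquad \mathbf{let}\ \gamma_t = \lambda\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\,(\tilde{e}_1, \ldots, \tilde{e}_k,\ (\gamma^1_t\ cast_C(\alpha_{\gamma_1}, \gamma^1_1) \ldots cast_C(\alpha_{\gamma_n}, \gamma^1_n)),\ \ldots)
\end{array}}
\]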
‘1  *n 


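\[
\text{(PROG$^+$)} \quad \frac{(A_{i-1}, C_{i-1}) \vdash^{+} d_i : (A_i, C_i) \Rightarrow \tilde{d}_i \quad i = 1 \ldots n \qquad (A_n, C_n) \vdash^{+} e : \tau \Rightarrow \tilde{e}}{(A_0, C_0) \vdash^{+} d_1 \ldots d_n\ e : \tau \Rightarrow \tilde{d}_1\ \mathbf{in} \ldots \mathbf{in}\ \tilde{d}_n\ \mathbf{in}\ \tilde{e}}
\]

\[
\text{(DATA$^+$)} \quad \frac{\begin{array}{c}
\sigma = \forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\mu\beta.K_1 \exists\beta_{\delta_{11}} \ldots \beta_{\delta_{1k_1}}.\tau_1 + \ldots + K_m \exists\beta_{\delta_{m1}} \ldots \beta_{\delta_{mk_m}}.\tau_m \\
FT(\sigma) \subseteq Dom(C) \qquad t \notin Dom(C)
\end{array}}{\begin{array}{l}
(A, C) \vdash^{+} \mathbf{data}\ t = \sigma : {} \\
\qquad (A[\forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.(\exists\beta_{\delta_{i1}} \ldots \beta_{\delta_{ik_i}}.\tau_i[t(\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n})/\beta]) \to t(\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n})/K_i \mid i = 1 \ldots m], \\
\qquad\ C[t : (\gamma_1, \ldots, \gamma_n)\,\Omega]) \\
\quad \Rightarrow \mathbf{data}\ \forall\alpha_1 \ldots \alpha_n.(\mu\beta. \\
\qquad\quad K_1 \exists\beta_{11} \ldots \beta_{1k_1}.\tau_1[\alpha_i/\alpha_{\gamma_i}, \beta_{1j}/\beta_{\delta_{1j}}] \times \delta_{11}(\beta_{11}) \times \ldots \times \delta_{1k_1}(\beta_{1k_1}) \\
\qquad\quad {} + \ldots + {} \\
\qquad\quad K_m \exists\beta_{m1} \ldots \beta_{mk_m}.\tau_m[\alpha_i/\alpha_{\gamma_i}, \beta_{mj}/\beta_{\delta_{mj}}] \times \delta_{m1}(\beta_{m1}) \times \ldots \times \delta_{mk_m}(\beta_{mk_m}))
\end{array}}
\]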


The DATA+ rule translates a data declaration with order-sorted type variables
to a data declaration in the target language. The component types of
the translated datatype consist of the original component types together with
the dictionaries for the existentially quantified type variables. This is reflected
in the CONS+ and PAT+ rules below.

5.6.4  Translation  Rules  for  Expressions 

The  first  five  translation  rules  are  identical  to  the  ones  in  [NS91]. 


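\[
\text{(VAR$^+$)} \quad \frac{A(x) = \forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\tau}{\begin{array}{l}
(A, C) \vdash^{+} x : \tau[\tau_1/\alpha_{\gamma_1}, \ldots, \tau_n/\alpha_{\gamma_n}] \Rightarrow {} \\
\qquad (x\ dict_C(\tau_1, \gamma_1) \ldots dict_C(\tau_n, \gamma_n))
\end{array}}
\]

\[
\text{(PAIR$^+$)} \quad \frac{(A, C) \vdash^{+} e_1 : \tau_1 \Rightarrow \tilde{e}_1 \qquad (A, C) \vdash^{+} e_2 : \tau_2 \Rightarrow \tilde{e}_2}{(A, C) \vdash^{+} (e_1, e_2) : \tau_1 \times \tau_2 \Rightarrow (\tilde{e}_1, \tilde{e}_2)}
\]

\[
\text{(APPL$^+$)} \quad \frac{(A, C) \vdash^{+} e : \tau' \to \tau \Rightarrow \tilde{e} \qquad (A, C) \vdash^{+} e' : \tau' \Rightarrow \tilde{e}'}{(A, C) \vdash^{+} e\ e' : \tau \Rightarrow \tilde{e}\ \tilde{e}'}
\]

\[
\text{(ABS$^+$)} \quad \frac{(A[\tau'/x], C) \vdash^{+} e : \tau \Rightarrow \tilde{e}}{(A, C) \vdash^{+} \lambda x.\,e : \tau' \to \tau \Rightarrow \lambda x.\,\tilde{e}}
\]

\[
\text{(LET$^+$)} \quad \frac{\begin{array}{c}
(A, C) \vdash^{+} e : \tau \Rightarrow \tilde{e} \qquad FV(\tau) \setminus FV(A) = \{\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n}\} \\
(A[\forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\tau/x], C) \vdash^{+} e' : \tau' \Rightarrow \tilde{e}'
\end{array}}{\begin{array}{l}
(A, C) \vdash^{+} \mathbf{let}\ x = e\ \mathbf{in}\ e' : \tau' \Rightarrow {} \\
\qquad \mathbf{let}\ x = \lambda\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\,\tilde{e}\ \mathbf{in}\ \tilde{e}'
\end{array}}
\]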


In the LET+ rule, the translation $\tilde{e}$ of the expression to be bound to $x$ may
contain free dictionary variables corresponding to the generic type variables
in $\tau$; $\lambda$-bindings for those dictionaries need to be provided.
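
The effect of these dictionary abstractions can be imitated in Haskell. In the sketch below, which is our own illustration with assumed names, `source` uses a let-bound identifier with a generic type at two instances, and `target` shows a hand-written dictionary-passing counterpart in which the let-bound identifier is abstracted over its dictionary and each use supplies the dictionary for its instance type.

```haskell
class Key a where
  whatkey :: a -> Int

instance Key Int where whatkey = id
instance Key Bool where whatkey b = if b then 1 else 0

-- Source form: twice has the generic type  Key a => a -> Int.
source :: (Int, Int)
source =
  let twice :: Key a => a -> Int
      twice v = 2 * whatkey v
  in (twice (5 :: Int), twice True)

-- Dictionary type and instance dictionaries for Key.
newtype KeyD a = KeyD { whatkeyD :: a -> Int }

keyIntD :: KeyD Int
keyIntD = KeyD id

keyBoolD :: KeyD Bool
keyBoolD = KeyD (\b -> if b then 1 else 0)

-- Translated form: the bound expression is abstracted over the dictionary d,
-- and every occurrence of twice is applied to a concrete dictionary.
target :: (Int, Int)
target =
  let twice :: KeyD a -> a -> Int
      twice d v = 2 * whatkeyD d v
  in (twice keyIntD 5, twice keyBoolD True)
```

Both `source` and `target` evaluate to `(10,2)`.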

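New translation rules are added for value constructors, is expressions,
and pattern-matching let expressions.

\[
\text{(CONS$^+$)} \quad \frac{A(K) = \forall\alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.(\exists\beta_{\delta_1} \ldots \beta_{\delta_k}.\tau) \to t(\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n})}{\begin{array}{l}
(A, C) \vdash^{+} K : (\tau \to t(\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n}))[\tau_i/\alpha_{\gamma_i}, \tau'_j/\beta_{\delta_j}] \Rightarrow {} \\
\qquad (\lambda x.\,K(x,\ dict_C(\tau'_1, \delta_1), \ldots, dict_C(\tau'_k, \delta_k)))
\end{array}}
\]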


The translation of a value constructor is again a value constructor; this translated
constructor packs dictionaries for the types substituted for the existentially
quantified type variables together with the value.

\[
\text{(TEST$^+$)} \quad \frac{A(K) \geq_C \eta \to t(\tau_1, \ldots, \tau_n)}{(A, C) \vdash^{+} \mathbf{is}\ K : t(\tau_1, \ldots, \tau_n) \to \mathit{bool} \Rightarrow \mathbf{is}\ K}
\]

The is K expression is needed to examine the constructor tag and translates
to itself.


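\[
\text{(PAT$^+$)} \quad \frac{\begin{array}{c}
A(K) \geq_C (\exists\beta_{\delta_1} \ldots \beta_{\delta_k}.\tau) \to t(\tau_1, \ldots, \tau_n) \\
(A, C) \vdash^{+} e : t(\tau_1, \ldots, \tau_n) \Rightarrow \tilde{e} \\
\{\kappa_1, \ldots, \kappa_k\} \cap (FS(\tau') \cup FS(A)) = \emptyset \\
(A[\tau[\kappa_i/\beta_{\delta_i} \mid i = 1 \ldots k]/x],\ C[\kappa_i : \delta_i \mid i = 1 \ldots k]) \vdash^{+} e' : \tau' \Rightarrow \tilde{e}'
\end{array}}{\begin{array}{l}
(A, C) \vdash^{+} \mathbf{let}\ K\ x = e\ \mathbf{in}\ e' : \tau' \Rightarrow {} \\
\qquad \mathbf{let}\ K(x, \delta_{\kappa_1}, \ldots, \delta_{\kappa_k}) = \tilde{e}\ \mathbf{in}\ \tilde{e}'
\end{array}}
\]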


To translate a pattern-matching let expression, we need to look at the way
value constructors are translated. We need to provide a binding for the original
bound variable $x$, which corresponds to the first component of the encapsulated
value; we further need to retrieve the dictionaries for the Skolem
type constructors from the remaining $k$ components of the encapsulated value
and bind them to variables $\delta_{\kappa_1}, \ldots, \delta_{\kappa_k}$. Any of these bound variables may
occur in $\tilde{e}'$, the translation of the body of the let expression.
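
The interplay of the translated constructor and the translated pattern match can be imitated in present-day GHC Haskell. The sketch below is an illustration under assumed names, not output of the translation itself: `source` uses a class-constrained existential constructor, while `target` uses a constructor `MkKey'` without class context whose argument is a pair of the value and its dictionary, so that constructing packs the dictionary and pattern matching retrieves it.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Source form: the class context supplies the operations implicitly.
class Key a where
  whatkey :: a -> Int

instance Key Bool where
  whatkey b = if b then 1 else 0

data KEY = forall a. Key a => MkKey a

source :: Int
source = case MkKey True of
  MkKey x -> whatkey x

-- Target form: explicit dictionary type and instance dictionary.
newtype KeyD a = KeyD (a -> Int)

whatkeyT :: KeyD a -> a -> Int
whatkeyT (KeyD f) = f

keyBoolD :: KeyD Bool
keyBoolD = KeyD (\b -> if b then 1 else 0)

-- The translated datatype carries the dictionary next to the value.
data KEY' = forall b. MkKey' (b, KeyD b)

-- Translated constructor at instance type Bool: packs the dictionary.
mkKeyBool :: Bool -> KEY'
mkKeyBool x = MkKey' (x, keyBoolD)

-- Translated pattern match: binds the value x and the dictionary d.
target :: Int
target = case mkKeyBool True of
  MkKey' (x, d) -> whatkeyT d x
```

Both `source` and `target` evaluate to 1.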

Since $\tilde{e}$, the translation of the expression to be bound, is used monomorphically,
no $\lambda$-bindings for potentially free dictionary variables need to be
provided.

5.6.5  Properties of the Translation

As in [NS91], the correctness of our translation scheme depends on the condition
that each instance declaration lists exactly the same operations
$x_1, \ldots, x_k$ as the corresponding class declaration, and in the same order. This
scheme prohibits redefinition of operations listed in superclasses.

Furthermore, when translating a program $d_1 \ldots d_n\ e$, we require the signature
resulting from elaborating $d_1 \ldots d_n$ to satisfy the injectivity and subsort
reflection conditions stated in Section 5.5.1.

The next lemma says that the translation can only introduce free dictionary
variables in an expression and is needed in the main theorem.

Lemma 5.6 [Free variables] If $(A, C) \vdash^{+} e : \tau \Rightarrow \tilde{e}$, then $FV(\tilde{e}) \supseteq FV(e)$,
and $FV(\tilde{e}) \setminus FV(e)$ contains only dictionary variables.

Proof: By structural induction on $e$. Free dictionary variables are explicitly
generated in the VAR+ and CONS+ rules. The variables that are bound
in the LET+ rule are also dictionary variables and, by definition, are not
free in the original expression.

Lemma 5.7 [Free variables] If $(A, C) \vdash^{+} e : \tau \Rightarrow \tilde{e}$ and
$(A', C') \vdash^{+} e' : \tau' \Rightarrow \tilde{e}'$, then

\[
(FV(\tilde{e}) \cup FV(\tilde{e}')) \setminus (FV(e) \cup FV(e')) = (FV(\tilde{e}) \setminus FV(e)) \cup (FV(\tilde{e}') \setminus FV(e')).
\]

Proof: Using Lemma 5.6 and the fact that $FV(e)$ and $FV(e')$ do not contain
any dictionary variables.

Lemma 5.8 [Types of dictionaries] Let $d_1 \ldots d_n\ e$ be a program with translation
$\tilde{d}_1$ in ... in $\tilde{d}_n$ in $\tilde{e}$. Let $C$ be the unitary signature obtained by
elaborating $d_1 \ldots d_n$, and $A$ a superset of the assumption set obtained by elaborating
$\tilde{d}_1$ in ... in $\tilde{d}_n$. The following type judgments hold for occurrences
of $cast$ and $dict$ in $\tilde{e}$:

\[
\begin{array}{l}
A[\gamma(\alpha)/\alpha_\gamma] \vdash cast_C(\alpha_\gamma, \gamma') : \gamma'(\alpha) \\[1ex]
A[\gamma_i(\alpha_i)/\alpha_{\gamma_i} \mid i = 1 \ldots n] \vdash dict_C(\tau, \gamma) : \gamma(\tau[\alpha_i/\alpha_{\gamma_i} \mid i = 1 \ldots n])
\end{array}
\]

Proof: $\tilde{d}_1$ in ... in $\tilde{d}_n$ in is a nested let expression without body; it results
in assumptions for superclass dictionary access functions $\gamma_i^{\gamma}$ and
instance dictionaries of $t$ for $\gamma$ and its superclasses. The claim follows
from the definitions of $cast$ and $dict$ and the conditions on $C$.

The following main theorem of this section states that a well-typed
expression in the original type system translates to an expression that is
well typed in the type system of the target language.

Theorem 5.9 [Type preservation] If $(A, C) \vdash^{+} e : \tau \Rightarrow \bar{e}$
and $(FV(\bar{e}) \setminus FV(e)) \cup (FV(\tau) \setminus FV(A)) = \{\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n}\}$,
then

\[ ML(A)[\gamma_1(\alpha_1)/\alpha_{\gamma_1}, \ldots, \gamma_n(\alpha_n)/\alpha_{\gamma_n}] \vdash \bar{e} : \tau[\alpha_1/\alpha_{\gamma_1}, \ldots, \alpha_n/\alpha_{\gamma_n}], \]

where $\{\alpha_1, \ldots, \alpha_n\} \cap FV(ML(A)) = \emptyset$.

Proof: We first observe that the translation rules from Section 5.6.3 exactly
implement the type translation scheme from Section 5.6.2 by producing the
corresponding let expressions without bodies. We then continue by structural
induction on the expression $e$, going through each case in turn:
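
\[ (A, C) \vdash^{+} x : \tau[\tau_1/\alpha_{\gamma_1}, \ldots, \tau_n/\alpha_{\gamma_n}] \Rightarrow (x\ \mathit{dict}_C(\tau_1, \gamma_1) \ldots \mathit{dict}_C(\tau_n, \gamma_n)) \]

The premise of this judgment is $A(x) = \forall \alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\, \tau$,
whence

\[ (ML(A))(x) = \forall \alpha_1 \ldots \alpha_n.\ \gamma_1(\alpha_1) \to \cdots \to \gamma_n(\alpha_n) \to \tau[\alpha_1/\alpha_{\gamma_1}, \ldots, \alpha_n/\alpha_{\gamma_n}]. \]

Observing that $FV(\mathit{dict}_C(\tau_i, \gamma_i)) = FV(\tau_i)$, let

\[ \{\alpha'_{\gamma'_1}, \ldots, \alpha'_{\gamma'_m}\} = FV(\tau_1) \cup \ldots \cup FV(\tau_n) = FV(x\ \mathit{dict}_C(\tau_1, \gamma_1) \ldots). \]

By Lemma 5.8 and extending the assumption set with assumptions for all of the
$\alpha'_{\gamma'_p}$ where necessary, we have for $1 \le i \le n$

\[ ML(A)[\gamma'_p(\alpha'_p)/\alpha'_{\gamma'_p},\ p = 1 \ldots m] \vdash \mathit{dict}_C(\tau_i, \gamma_i) : \gamma_i(\tau_i[\alpha'_p/\alpha'_{\gamma'_p},\ p = 1 \ldots m]). \]

By extending the assumption set again and using the TAUT rule, we also have

\[ ML(A)[\gamma'_p(\alpha'_p)/\alpha'_{\gamma'_p},\ p = 1 \ldots m] \vdash x : (\gamma_1(\alpha_{\gamma_1}) \to \cdots \to \gamma_n(\alpha_{\gamma_n}) \to \tau)[\tau_i[\alpha'_p/\alpha'_{\gamma'_p},\ p = 1 \ldots m]/\alpha_{\gamma_i}]. \]

We apply the APPL rule $n$ times and obtain

\[ ML(A)[\gamma'_p(\alpha'_p)/\alpha'_{\gamma'_p},\ p = 1 \ldots m] \vdash (x\ \mathit{dict}_C(\tau_1, \gamma_1) \ldots \mathit{dict}_C(\tau_n, \gamma_n)) : (\tau[\tau_1/\alpha_{\gamma_1}, \ldots, \tau_n/\alpha_{\gamma_n}])[\alpha'_p/\alpha'_{\gamma'_p},\ p = 1 \ldots m]. \]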
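
To make the shape of this case concrete, the following Haskell sketch writes
the dictionary-passing form by hand. The record `EqD`, the dictionaries
`eqInt` and `eqList`, and the function `member` are hypothetical names chosen
for illustration; they only approximate what a term built from
$\mathit{dict}_C$ looks like and are not the construction defined in the
thesis.

```haskell
-- A class dictionary gamma(alpha), here for an equality class (hypothetical).
newtype EqD a = EqD { eq :: a -> a -> Bool }

-- Instance dictionary for Int.
eqInt :: EqD Int
eqInt = EqD (==)

-- Instance dictionary for lists, built from the element dictionary.
-- It mentions exactly the type variables of its type argument.
eqList :: EqD a -> EqD [a]
eqList d = EqD go
  where
    go (x:xs) (y:ys) = eq d x y && go xs ys
    go []     []     = True
    go _      _      = False

-- An overloaded identifier after translation: its type has the form
-- gamma(alpha) -> tau, with the dictionary as an ordinary first argument.
member :: EqD a -> a -> [a] -> Bool
member d x = any (eq d x)

-- A use of `member` at element type [Int] becomes an application of the
-- identifier to the dictionary term for that instance type.
hasRow :: Bool
hasRow = member (eqList eqInt) [1, 2] [[0], [1, 2], [3]]
```

The application `member (eqList eqInt)` plays the role of
$x\ \mathit{dict}_C(\tau_1, \gamma_1)$ with $\tau_1$ a list type.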

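\[ (A, C) \vdash^{+} e\ e' : \tau \Rightarrow \bar{e}\ \bar{e}' \]

The premises of this judgment according to the translation rules are
$(A, C) \vdash^{+} e : \tau' \to \tau \Rightarrow \bar{e}$ and
$(A, C) \vdash^{+} e' : \tau' \Rightarrow \bar{e}'$. Let

\[ (FV(\bar{e}\ \bar{e}') \setminus FV(e\ e')) \cup ((FV(\tau) \cup FV(\tau')) \setminus FV(A)) = \{\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_m}\}. \]

Using Lemma 5.7, we choose a suitable numbering such that

\[ (FV(\bar{e}) \setminus FV(e)) \cup (FV(\tau' \to \tau) \setminus FV(A)) = \{\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_l}\} \]

and

\[ (FV(\bar{e}') \setminus FV(e')) \cup (FV(\tau') \setminus FV(A)) = \{\alpha_{\gamma_{h+1}}, \ldots, \alpha_{\gamma_m}\}, \]

where $h \le l$. By the induction assumption, the following two judgments hold:

\[ ML(A)[\gamma_i(\alpha_i)/\alpha_{\gamma_i},\ i = 1 \ldots l] \vdash \bar{e} : (\tau' \to \tau)[\alpha_i/\alpha_{\gamma_i},\ i = 1 \ldots l] \]

\[ ML(A)[\gamma_i(\alpha_i)/\alpha_{\gamma_i},\ i = h+1 \ldots m] \vdash \bar{e}' : \tau'[\alpha_i/\alpha_{\gamma_i},\ i = h+1 \ldots m] \]

Since we can extend the assumption sets and substitutions in both judgments
for variables that do not occur free, we obtain:

\[ ML(A)[\gamma_i(\alpha_i)/\alpha_{\gamma_i},\ i = 1 \ldots m] \vdash \bar{e} : (\tau' \to \tau)[\alpha_i/\alpha_{\gamma_i},\ i = 1 \ldots m] \]

\[ ML(A)[\gamma_i(\alpha_i)/\alpha_{\gamma_i},\ i = 1 \ldots m] \vdash \bar{e}' : \tau'[\alpha_i/\alpha_{\gamma_i},\ i = 1 \ldots m] \]

Our claim follows by applying the APPL rule and eliminating the superfluous
variables

\[ \{\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_m}\} \setminus ((FV(\bar{e}\ \bar{e}') \setminus FV(e\ e')) \cup (FV(\tau) \setminus FV(A))) \]

from the assumption set.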


\[ (A, C) \vdash^{+} (e_1, e_2) : \tau_1 \times \tau_2 \Rightarrow (\bar{e}_1, \bar{e}_2) \]

\[ (A, C) \vdash^{+} \lambda x.\, e : \tau' \to \tau \Rightarrow \lambda x.\, \bar{e} \]

These two cases are handled in the same way as the previous one.

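\[ (A, C) \vdash^{+} \mathbf{let}\ x = e\ \mathbf{in}\ e' : \tau' \Rightarrow \mathbf{let}\ x = \lambda \alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\, \bar{e}\ \mathbf{in}\ \bar{e}' \]

Let

\[ (FV(\bar{e}\ \bar{e}') \setminus FV(e\ e')) \cup ((FV(\tau) \cup FV(\tau')) \setminus FV(A)) = \{\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_m}\}. \]

Using Lemma 5.7, we choose a suitable numbering such that

\[ (FV(\bar{e}) \setminus FV(e)) \cup (FV(\tau) \setminus FV(A)) = \{\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_l}\} \]

and

\[ (FV(\bar{e}') \setminus FV(e')) \cup (FV(\tau') \setminus FV(A)) = \{\alpha_{\gamma_{h+1}}, \ldots, \alpha_{\gamma_m}\}, \]

where $n \le h \le l \le m$. By the inductive assumption,

\[ ML(A)[\gamma_i(\alpha_i)/\alpha_{\gamma_i},\ i = 1 \ldots l] \vdash \bar{e} : \tau[\alpha_i/\alpha_{\gamma_i},\ i = 1 \ldots l], \]

and after $n$ applications of the ABS rule, we obtain

\[ ML(A)[\gamma_i(\alpha_i)/\alpha_{\gamma_i},\ i = n+1 \ldots l] \vdash \lambda \alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\, \bar{e} : \gamma_1(\alpha_{\gamma_1}) \to \cdots \to \gamma_n(\alpha_{\gamma_n}) \to \tau[\alpha_i/\alpha_{\gamma_i},\ i = n+1 \ldots l]. \]

We apply the inductive assumption to the last premise, observing that
$FV(A) = FV(A[\forall \alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\, \tau/x])$:

\[ ML(A[\forall \alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\, \tau/x])[\gamma_i(\alpha_i)/\alpha_{\gamma_i},\ i = h+1 \ldots m] \vdash \bar{e}' : \tau'[\alpha_i/\alpha_{\gamma_i},\ i = h+1 \ldots m]. \]

Finally, we extend the assumption sets of this and the preceding judgment to
include $\alpha_{\gamma_{n+1}}, \ldots, \alpha_{\gamma_m}$ and are ready to
apply the LET rule.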
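
As an informal illustration of the let case, the sketch below hand-translates
a small overloaded let binding into dictionary-passing style in Haskell. The
dictionary type `NumD` and the values `numInt` and `numDouble` are
hypothetical; the point is only that the bound identifier is abstracted over
its dictionary and remains polymorphic, so each use can supply a different
dictionary.

```haskell
-- Dictionary for a class with one operation (hypothetical name).
newtype NumD a = NumD { plus :: a -> a -> a }

numInt :: NumD Int
numInt = NumD (+)

numDouble :: NumD Double
numDouble = NumD (+)

-- Source:  let double v = v + v in (double 2, double 1.5)
-- Target:  the let-bound identifier is a lambda over its dictionary,
--          and each use applies it to the dictionary for its instance type.
example :: (Int, Double)
example =
  let double = \d v -> plus d v v
  in (double numInt 2, double numDouble 1.5)
```

Because the dictionary is an explicit argument, the binding for `double`
carries no class constraint and is generalized by ordinary let-polymorphism.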

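\[ (A, C) \vdash^{+} K : (\tau \to t(\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_n}))[\tau_i/\alpha_{\gamma_i}, \hat{\tau}_j/\beta_{\delta_j}] \Rightarrow (\lambda x.\, K(x, \mathit{dict}_C(\hat{\tau}_1, \delta_1), \ldots, \mathit{dict}_C(\hat{\tau}_k, \delta_k))) \]

Let $ML(A(K)) = \forall \alpha_1 \ldots \alpha_n.\ \mu\beta.\, \Sigma[K\, \eta] = \forall \alpha_1 \ldots \alpha_n.\, \chi$,
where

\[ \eta = \exists \beta_1 \ldots \beta_k.\ \tau[\alpha_i/\alpha_{\gamma_i}, \beta_j/\beta_{\delta_j}] \times \delta_1(\beta_1) \times \cdots \times \delta_k(\beta_k), \]

and let

\[ FV(\tau_1) \cup \ldots \cup FV(\tau_n) \cup FV(\hat{\tau}_1) \cup \ldots \cup FV(\hat{\tau}_k) = \{\alpha'_{\gamma'_1}, \ldots, \alpha'_{\gamma'_m}\}. \]

Furthermore, let $\tau' = \tau[\tau_i/\alpha_{\gamma_i}, \hat{\tau}_j/\beta_{\delta_j}]$.

We first apply the CONS rule to derive

\[ ML(A)[\gamma'_1(\alpha'_1)/\alpha'_{\gamma'_1}, \ldots, \gamma'_m(\alpha'_m)/\alpha'_{\gamma'_m},\ \tau'[\alpha'_p/\alpha'_{\gamma'_p}]/x] \vdash K : (\tau' \times \delta_1(\hat{\tau}_1) \times \cdots \times \delta_k(\hat{\tau}_k) \to \chi[\tau_i/\alpha_i])[\alpha'_p/\alpha'_{\gamma'_p}] \]

and the TUPLE rule together with Lemma 5.7 to derive

\[ ML(A)[\gamma'_1(\alpha'_1)/\alpha'_{\gamma'_1}, \ldots, \gamma'_m(\alpha'_m)/\alpha'_{\gamma'_m},\ \tau'[\alpha'_p/\alpha'_{\gamma'_p}]/x] \vdash (x, \mathit{dict}_C(\hat{\tau}_1, \delta_1), \ldots, \mathit{dict}_C(\hat{\tau}_k, \delta_k)) : (\tau' \times \delta_1(\hat{\tau}_1) \times \cdots \times \delta_k(\hat{\tau}_k))[\alpha'_p/\alpha'_{\gamma'_p}] \]

for the argument supplied to $K$. We then use the APPL rule to derive

\[ ML(A)[\gamma'_1(\alpha'_1)/\alpha'_{\gamma'_1}, \ldots, \gamma'_m(\alpha'_m)/\alpha'_{\gamma'_m},\ \tau'[\alpha'_p/\alpha'_{\gamma'_p}]/x] \vdash K(x, \mathit{dict}_C(\hat{\tau}_1, \delta_1), \ldots, \mathit{dict}_C(\hat{\tau}_k, \delta_k)) : (\chi[\tau_i/\alpha_i])[\alpha'_p/\alpha'_{\gamma'_p}] \]

and finally the ABS rule, which gives us

\[ ML(A)[\gamma'_1(\alpha'_1)/\alpha'_{\gamma'_1}, \ldots, \gamma'_m(\alpha'_m)/\alpha'_{\gamma'_m}] \vdash (\lambda x.\, K(x, \mathit{dict}_C(\hat{\tau}_1, \delta_1), \ldots, \mathit{dict}_C(\hat{\tau}_k, \delta_k))) : (\tau' \to \chi[\tau_i/\alpha_i])[\alpha'_p/\alpha'_{\gamma'_p}]. \]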

\[ (A, C) \vdash^{+} \mathbf{is}\ K : t(\tau_1, \ldots, \tau_n) \to \mathit{bool} \Rightarrow \mathbf{is}\ K \]

This case is a straightforward application of the TEST rule in the target
language.

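\[ (A, C) \vdash^{+} \mathbf{let}\ K\ x = e\ \mathbf{in}\ e' : \tau' \Rightarrow \mathbf{let}\ K(x, \delta_{\kappa_1}, \ldots, \delta_{\kappa_k}) = \bar{e}\ \mathbf{in}\ \bar{e}' \]

Let

\[ (FV(\bar{e}\ \bar{e}') \setminus FV(e\ e')) \cup ((FV(t(\tau_1, \ldots, \tau_n)) \cup FV(\tau')) \setminus FV(A)) = \{\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_m}\}, \]

choosing a suitable numbering such that

\[ (FV(\bar{e}) \setminus FV(e)) \cup (FV(t(\tau_1, \ldots, \tau_n)) \setminus FV(A)) = \{\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_l}\} \]

and

\[ (FV(\bar{e}') \setminus FV(e')) \cup (FV(\tau') \setminus FV(A)) = \{\alpha_{\gamma_{h+1}}, \ldots, \alpha_{\gamma_m}\}, \]

where $h \le l$. Let
$\forall \alpha_1 \ldots \alpha_n.\ \mu\beta.\, \Sigma[K\, \exists \beta_1 \ldots \beta_k.\, \tilde{\tau}] = ML(\sigma)$,
where

\[ \tilde{\tau} = \tau[\alpha_i/\alpha_{\gamma_i}, \beta_1/\beta_{\delta_1}, \ldots, \beta_k/\beta_{\delta_k}] \times \delta_1(\beta_1) \times \cdots \times \delta_k(\beta_k). \]

By the induction assumption,

\[ ML(A)[\gamma_i(\alpha_i)/\alpha_{\gamma_i},\ i = 1 \ldots l] \vdash \bar{e} : \hat{t}[\alpha_i/\alpha_{\gamma_i},\ i = 1 \ldots l], \]

where $\hat{t} = (\mu\beta.\, \Sigma[K\, \exists \beta_1 \ldots \beta_k.\, \tilde{\tau}])[\tau_1/\alpha_1, \ldots, \tau_n/\alpha_n]$.
We further apply the induction assumption to the last premise to obtain

\[ ML(A)[\tau[\tau_i/\alpha_{\gamma_i}, \kappa_j/\beta_{\delta_j}]/x,\ \delta_1(\kappa_1)/\delta_{\kappa_1}, \ldots, \delta_k(\kappa_k)/\delta_{\kappa_k}][\gamma_i(\alpha_i)/\alpha_{\gamma_i},\ i = h+1 \ldots m] \vdash \bar{e}' : \tau'[\alpha_i/\alpha_{\gamma_i},\ i = h+1 \ldots m]. \]

Note that $\mathit{dict}_C(\kappa_j, \delta_j) = \delta_{\kappa_j}$ in $\bar{e}'$.
We now extend the assumption sets of this and the preceding judgment to
include $\alpha_{\gamma_1}, \ldots, \alpha_{\gamma_m}$, and apply the PAT″
rule. Our claim follows after restricting the final assumption set to
$(FV(\bar{e}\ \bar{e}') \setminus FV(e\ e')) \cup (FV(\tau') \setminus FV(A))$.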
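
The constructor and pattern-matching cases can be pictured with a small
hand-written target program. The sketch below uses GHC's
`ExistentialQuantification` extension, whose syntax differs from the language
of this chapter; the names `ShowD`, `T`, `K`, `packInt`, `packBool`, and
`render` are hypothetical and serve only to show a value being packaged with
its dictionary and the dictionary being bound again when the package is
opened.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Dictionary for a class with a single operation (hypothetical name).
newtype ShowD b = ShowD { display :: b -> String }

showInt :: ShowD Int
showInt = ShowD show

showBool :: ShowD Bool
showBool = ShowD (\b -> if b then "yes" else "no")

-- Target-language datatype: the existentially quantified component is a
-- pair of the value and the dictionary for its class.
data T = forall b. K (b, ShowD b)

-- Constructor case: a source-level `K 3` becomes `(\x -> K (x, dict)) 3`.
packInt :: Int -> T
packInt = \x -> K (x, showInt)

packBool :: Bool -> T
packBool = \x -> K (x, showBool)

-- Pattern case: opening the package binds the dictionary alongside x, so a
-- method call in the body refers to the bound dictionary variable.
render :: T -> String
render t = case t of
  K (x, d) -> display d x

rendered :: [String]
rendered = map render [packInt 3, packBool True]
```

GHC requires a `case` expression or function clause to open an existential
package; the pattern-matching `let` of this chapter has no direct GHC
counterpart.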


The following corollary is a deterministic version of the type preservation
theorem in [NS91]. It covers the case of unambiguous resulting expressions,
that is, expressions whose translations do not contain free dictionary
variables not free in their types.

Corollary 5.10 [Type preservation] Let all types in the range of $A$ be closed.
If $(A, C) \vdash^{+} e : \tau \Rightarrow \bar{e}$ and
$FV(\bar{e}) \setminus FV(e) \subseteq FV(\tau)$, then

\[ ML(A) \vdash \lambda \alpha_{\gamma_1} \ldots \alpha_{\gamma_n}.\, \bar{e} : \gamma_1(\alpha_1) \to \cdots \to \gamma_n(\alpha_n) \to \tau[\alpha_1/\alpha_{\gamma_1}, \ldots, \alpha_n/\alpha_{\gamma_n}], \]

where $\{\alpha_1, \ldots, \alpha_n\} \cap FV(ML(A)) = \emptyset$.

Proof: We use the preceding theorem and apply the ABS rule from Chapter 3
$n$ successive times.

Corollary 5.11 [Semantic soundness] Let all types in the range of $A$ be
closed, and let $\psi$ be a type environment such that for every
$\alpha \in \mathit{Dom}\,\psi$, $\mathit{wrong} \notin \psi(\alpha)$. If
$(A, C) \vdash^{+} e : \tau \Rightarrow \bar{e}$ and
$\models_{\rho, \psi} ML(A)$, then

\[ E[\![\bar{e}]\!]\rho \ne \mathit{wrong}. \]

Proof: By type preservation and semantic soundness of the target language.



6  Related  Work,  Future  Work,  and 
Conclusions 


6.1  Related  Work 

The  following  table  compares  our  work  with  other  programming  languages 
with  similar  features  or  objectives.  The  design  criteria  used  as  a  basis  for 
our  comparison  are  taken  from  Section  1.1: 

1.  Strong  and  static  typing, 

2.  type  reconstruction, 

3.  higher-order  functions, 

4.  parametric  polymorphism, 

5.  extensible  abstract  types  with  multiple  implementations,  and 

6.  first-class  abstract  types. 

In the table, ✓ means the feature is supported, O means it is not fully
supported, and a blank entry means it is not supported at all.

                    Design Criterion
Language        1.   2.   3.   4.   5.   6.
Our work        ✓    ✓    ✓    ✓    ✓    ✓
ML/Haskell      ✓    ✓    ✓    ✓    O    O
SOL             ✓         ✓    ✓    ✓    ✓
Hope+C          O    ✓    ✓    O    O    ✓
XML+            ✓         ✓    ✓    ✓    ✓
Dynamics        O    ✓    ✓    ✓    O    O

OOL 

o 

o 

o 

✓ 

6.1.1  SOL 

SOL is based on the full second-order polymorphic λ-calculus. It is not
known whether there is a type reconstruction algorithm for this language.

6.1.2  Hope+C 

The only other work known to us that deals with Damas-Milner-style type
reconstruction for existential types is [Per90]. However, the typing rules
given there are not sufficient to guarantee the absence of runtime type errors,
even though the Hope+C compiler seems to impose sufficient restrictions.
The following unsafe program, here given in ML syntax, is well-typed according
to the typing rules, but rejected by the compiler:

datatype T = K of ''a

fun f x = let val K z = x in z end

f (K 1) = f (K true)

In addition, an identifier bound in a pattern-matching let expression is not
polymorphic according to the typing rules. This restriction does not apply to
our work.
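
For comparison, the same situation can be sketched in present-day Haskell
using GHC's `ExistentialQuantification` extension. This is an analogy rather
than the Hope+C program itself: the `Eq a` context stands in for the equality
type variable, and `selfEqual` is a hypothetical function added to show a use
that a sound type system accepts.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Counterpart of `datatype T = K of ''a`: the component type is hidden,
-- but is known to support equality.
data T = forall a. Eq a => K a

-- The counterpart of `fun f x = let val K z = x in z end` is rejected by GHC,
-- because the hidden type of z would escape the scope of the pattern match:
--
--   f (K z) = z
--
-- A use that keeps z inside the scope of the pattern match is accepted.
selfEqual :: T -> Bool
selfEqual (K z) = z == z

-- Packages built from different representation types share one type T.
checks :: [Bool]
checks = map selfEqual [K (1 :: Int), K True]
```

Since `f` cannot be written, the comparison `f (K 1) = f (K true)` between an
integer and a boolean can never be formed.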

6.1.3  XML+ 

The possibility of making ML structures first-class by implicitly hiding their
type components is discussed in [MMM91] without addressing the issue of
type inference. By hiding the type components of a structure, its type is
implicitly coerced from a strong sum type to an existential type. Detailed
discussions of sum types can be found in [Mac86] [MH88].

6.1.4  Dynamics  in  ML 

An extension of ML with objects that carry dynamic type information is
described in [LM91]. A dynamic is a pair consisting of a value and the type of
the value. Such an object is constructed from a value by applying the
constructor dynamic. The object can then be dynamically coerced by pattern
matching on both the value and the runtime type. Existential types are used
to match dynamic values against dynamic patterns with incomplete type
information. Dynamics are useful for typing functions such as eval. However,
they do not provide type abstraction, since they give access to the type
of an object at runtime. It seems possible to combine their system with ours,
extending their existential patterns to existential types. We are currently
investigating this point.
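
The contrast with type abstraction can be seen in a short sketch using the
`Data.Dynamic` module from GHC's base library. This module is not the system
of [LM91]; it is used here only as a readily available approximation of
values that carry their type at runtime. The names `values`, `ints`, and
`describe` are hypothetical.

```haskell
import Data.Dynamic (Dynamic, toDyn, fromDynamic)
import Data.Maybe (mapMaybe)

-- Values of different types injected into a single type of dynamics.
values :: [Dynamic]
values = [toDyn (1 :: Int), toDyn True, toDyn "three", toDyn (4 :: Int)]

-- Coercion back to a static type succeeds only when the runtime type matches.
ints :: [Int]
ints = mapMaybe fromDynamic values

-- A client can inspect the runtime type, so the representation is not hidden.
describe :: Dynamic -> String
describe d = case (fromDynamic d :: Maybe Int, fromDynamic d :: Maybe Bool) of
  (Just n, _) -> "Int " ++ show n
  (_, Just b) -> "Bool " ++ show b
  _           -> "other"
```

Any client may recover the underlying `Int` or `Bool`, which is exactly the
access that an abstract type is meant to prevent.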

6.1.5  Object-Oriented  Languages 

Most statically typed object-oriented languages identify subclassing with
subtyping (C++ [Str86], Modula-3 [CDG+89]) at the expense of severely
restricting the expressive power of the language. Due to the contravariance
rule for function subtyping, not even simple algebraic structures can be
described in C++; this is discussed in detail in [CHC90] [HL91].

Recognizing and attempting to overcome this restriction, other languages
sacrifice static typing (Eiffel [Mey92], Ada 9X [dod91]) and rely on run-time
checks to guarantee compatibility of function arguments.

Furthermore, most object-oriented languages do not support type
reconstruction; a recent advance in type reconstruction for a Smalltalk-like
language is presented in [PS91].

6.2  Current  State  of  Implementation 

We have implemented a Standard ML prototype of an interpreter with type
reconstruction for our core language, Mini-ML [CDDK86] extended with
recursive datatypes over existentially quantified component types. The
ML-style examples from this thesis have been developed and tested using our
interpreter.

Technically,  the  interpreter  consists  of  the  following  components: 

•  Lexer and parser were built using the tools ML-Lex [AMT89] and
ML-Yacc [TA91], respectively.

•  The type reconstruction phase is based on [Han87].

•  The evaluator directly implements the denotational semantics presented
in Section 3.6.2.

We plan to develop this prototype further into an interpreter for a full
language based on our extension of SML.

The latest releases of the Lazy ML [AJ92] and Haskell B. [Aug92] systems
feature datatypes with existentially quantified component types. Both
systems were developed at the Chalmers University of Technology; they
provide full compilers and interpreters capable of dealing with larger
programs. The Haskell examples from this thesis have been tested using the
Chalmers Haskell B. interpreter.

6.3  Conclusions 

The question we had set out to answer in this dissertation was:

Is it feasible to design a high-level programming language that satisfies
criteria 1. through 6.?

We showed that such a design is feasible from a type-theoretic, a language
design, and an implementation perspective:

•  Type-theoretic view: Static typing and semantic soundness of the type
systems hold for all three languages presented. Furthermore, we extended
the Damas-Milner type reconstruction algorithm used in ML to cope with
our languages.

•  Language design view: Our examples demonstrated that we gain considerable
expressiveness and flexibility by adding first-class abstract types to ML
and Haskell while retaining the syntactic and semantic “look and feel” of
the original languages.

•  Implementation view: Our prototype implementation shows that our
languages can be implemented using standard techniques such as the ones
described in [Han87] or used in the Standard ML of New Jersey
implementation [AM92]. The Chalmers LML and HBC systems demonstrate
that it is feasible to implement our extensions in practical compilers
and interpreters.

6.4  Future  Work 

Our work leads to a number of future research directions, some of which
are discussed below.

6.4.1  Combination  of  Modules  and  Existential  Quantification 
in  ML 

We demonstrated in Chapter 5 how Haskell type classes can be used as
signatures of abstract data types. The ML module system also provides
signatures, which are strong sum types. One could imagine using these
signatures to describe interfaces of abstract types. First-class abstract
types could then be achieved by applying an injection that makes the type
components of the signature existentially quantified, along the lines of
[MMM91].
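
One way to picture such an injection is sketched below in Haskell, with a
record of operations standing in for an ML signature and GHC's
`ExistentialQuantification` extension hiding the type component. This is
only an approximation of the idea under those assumptions; it is not the
mechanism of [MMM91], and all names (`CounterSig`, `Counter`, `tick`,
`readCounter`) are hypothetical.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- A signature written as a record of operations over a type component c.
data CounterSig c = CounterSig
  { new   :: c
  , step  :: c -> c
  , value :: c -> Int
  }

-- The injection: the type component c becomes existentially quantified.
data Counter = forall c. Counter (CounterSig c) c

-- Two structures matching the signature, with different representations.
intCounter :: Counter
intCounter = Counter sig (new sig)
  where sig = CounterSig { new = 0, step = (+ 1), value = id }

unaryCounter :: Counter
unaryCounter = Counter sig (new sig)
  where sig = CounterSig { new = [], step = (() :), value = length }

-- Clients see only the operations of the signature.
tick :: Counter -> Counter
tick (Counter sig c) = Counter sig (step sig c)

readCounter :: Counter -> Int
readCounter (Counter sig c) = value sig c

-- Both implementations can be stored in one list and used uniformly.
readings :: [Int]
readings = map (readCounter . tick . tick) [intCounter, unaryCounter]
```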

6.4.2  A  Polymorphic  Pattern-Matching  let  Expression 

An identifier bound using the pattern-matching let expression from
Chapter 5 is monomorphic, whereas an identifier bound by the corresponding
expression from Chapter 3 can be used polymorphically. It would be
desirable to overcome this restriction by exploring an extended target
language, where a function depending on some method dictionaries can be
decomposed before being applied to any arguments. While this is unsound in
the general case, we conjecture that it is sound in our case, since the
arguments are the same whenever the bound identifier is used with the same
type.

6.4.3  Combination  of  Parameterized  Type  Classes  and 
Existential  Types  in  Haskell 

Type classes in Haskell are not parameterized, thus we cannot model
abstract container classes. This shortcoming was discussed in [LO91] and is
also present in our extension of Haskell described in Chapter 5; thus the
stack example from Section 5.2.2 is not polymorphic. An extension of
Haskell with parameterized type classes was recently presented in [CHO92];
it would be desirable to apply the same extension to our language. We
conjecture that parameterized type classes are an orthogonal extension and
combine well with existential quantification.

Another interesting extension of Haskell is one with a dotless dot
notation analogous to the ML extension from Chapter 4; it appears that such a
language could be translated into the language described in Chapter 5.
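
The kind of program this combination would allow can be sketched in
present-day Haskell, where classes over type constructors play a role similar
to parameterized type classes. The sketch assumes GHC's
`ExistentialQuantification` extension and is not the system of [CHO92]; the
class `Stack` and the types `ListStack`, `SizedStack`, and `AnyStack` are
hypothetical.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- A class over type constructors, standing in for a parameterized class.
class Stack s where
  empty :: s a
  push  :: a -> s a -> s a
  pop   :: s a -> Maybe (a, s a)

-- First implementation: a plain list.
newtype ListStack a = ListStack [a]

instance Stack ListStack where
  empty = ListStack []
  push x (ListStack xs) = ListStack (x : xs)
  pop (ListStack [])       = Nothing
  pop (ListStack (x : xs)) = Just (x, ListStack xs)

-- Second implementation: a list paired with its size.
data SizedStack a = SizedStack Int [a]

instance Stack SizedStack where
  empty = SizedStack 0 []
  push x (SizedStack n xs) = SizedStack (n + 1) (x : xs)
  pop (SizedStack _ [])       = Nothing
  pop (SizedStack n (x : xs)) = Just (x, SizedStack (n - 1) xs)

-- The implementation s is hidden; the element type a stays a parameter.
data AnyStack a = forall s. Stack s => AnyStack (s a)

pushAny :: a -> AnyStack a -> AnyStack a
pushAny x (AnyStack s) = AnyStack (push x s)

topAny :: AnyStack a -> Maybe a
topAny (AnyStack s) = fmap fst (pop s)

-- A list mixing both implementations, and a second use at another element type.
intTops :: [Maybe Int]
intTops = map (topAny . pushAny 7)
  [AnyStack (empty :: ListStack Int), AnyStack (empty :: SizedStack Int)]

charTops :: [Maybe Char]
charTops = map (topAny . pushAny 'x') [AnyStack (empty :: SizedStack Char)]
```

Here the abstract stack type is polymorphic in its element type while its
implementation remains hidden.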

6.4.4  Existential  Types  and  Mutable  State 

Since the full ML language also provides polymorphic references, an
extension of this language with existential types would depend on the
coexistence of existential types and polymorphic references. Similar
considerations hold for other forms of mutable state such as linear types
[Ode91] [Wad90].

6.4.5 Full Implementation

Whereas implementations of Lazy ML and Haskell B. extended with existential
types are now available, further implementation work could be envisioned
both at the ML level and at the Haskell level.

At the ML level, the language would be strict and include datatypes with
existentially quantified component types, polymorphic references, and
possibly modules.

At the Haskell level, the language could be strict or non-strict and include
existential quantification over parameterized type classes. Alternative
implementation strategies for Haskell or similar languages with type classes
could be explored; instead of translating to an ML-like language, type
classes could be mapped to C++ templates [Ode92]. A possible starting point
for further exploration could be an explicitly typed version of Mini-Haskell
in the spirit of [MH88].